Abstract

We present a technique for the automated verification of abstract models of multithreaded 
programs providing fresh name generation, name mobility, and unbounded control. 

As a high-level specification language we adopt here an extension of communication
finite-state machines with local variables ranging over an infinite name domain, called TDL
programs. Communication machines have proved very effective for representing
communication protocols as well as abstractions of multithreaded software.

The verification method that we propose is based on the encoding of TDL programs 
into a low level language based on multiset rewriting and constraints that can be viewed as 
an extension of Petri Nets. By means of this encoding, the symbolic verification procedure 
developed for the low level language in our previous work can now be applied to TDL 
programs. Furthermore, the encoding allows us to isolate a decidable class of verification 
problems for TDL programs that still provide fresh name generation, name mobility, and 
unbounded control. Our syntactic restrictions are in fact defined on the internal structure 
of threads: In order to obtain a complete and terminating method, threads are only allowed 
to have at most one local variable (ranging over an infinite domain of names). 

KEYWORDS: Constraints, Multithreaded Programs, Verification. 



1 Introduction 

Andrew Gordon (Gordon 2001) defines a nominal calculus to be a computational
formalism that includes a set of pure names and allows the dynamic generation of 
fresh, unguessable names. A name is pure whenever it is only useful for comparing 
for identity with other names. The use of pure names is ubiquitous in programming 
languages. Some important examples are memory pointers in imperative languages, 
identifiers in concurrent programming languages, and nonces in security protocols. 
In addition to pure names, a nominal process calculus should provide mechanisms 
for concurrency and inter-process communication. A computational model that pro- 
vides all these features is an adequate abstract formalism for the analysis of multi- 
threaded and distributed software. 

The Problem Automated verification of specifications in a nominal process calculus
becomes particularly challenging in the presence of the following three features: the
possibility of generating fresh names (name generation); the possibility of transmitting
names (name mobility); the possibility of dynamically adding new threads of control 
(unbounded control). In fact, a calculus that provides all the previous features can 
be used to specify systems with a state-space infinite in several dimensions. This
feature makes it difficult (if not impossible) to apply finite-state verification
techniques or techniques based on abstractions of process specifications into Petri
Nets or CCS-like models. In recent years there have been several attempts at
extending automated verification methods from finite-state to infinite-state systems
(Abdulla and Nylen 2000; Kesten et al. 2001). In this paper we are interested in
investigating the possible application of the methods we proposed in (Delzanno 2001)
to verification problems of interest for nominal process calculi.

Constraint-based Symbolic Model Checking In (Delzanno 2001) we introduced a
specification language, called MSR(C), for the analysis of communication protocols
whose specifications are parametric in several dimensions (e.g. number of servers,
clients, and tickets as in the model of the ticket mutual exclusion algorithm shown
in (Bozzano and Delzanno 2002)). MSR(C) combines multiset rewriting over first
order atomic formulas (Cervesato et al. 1999) with constraint programming. More
specifically, multiset rewriting is used to specify the control part of a concurrent
system, whereas constraints are used to symbolically specify the relations over
local data. The verification method proposed in (Delzanno 2005) allows us to
symbolically reason on the behavior of MSR(C) specifications. To this aim, following
(Abdulla et al. 1996; Abdulla and Nylen 2000) we introduced a symbolic
representation of infinite collections of global configurations based on the combination of
multisets of atomic formulas and constraints, called constrained configurations (see footnote 1).
The verification procedure performs a symbolic backward reachability analysis by
means of a symbolic pre-image operator that works over constrained configurations
(Delzanno 2005). The main feature of this method is the possibility of automatically
handling systems with an arbitrary number of components. Furthermore, since we 
use a symbolic and finite representation of possibly infinite sets of configurations, 
the analysis is carried out without loss of precision. 

A natural question for our research is whether and how these techniques can be 
used for verification of abstract models of multithreaded programs. 

Our Contribution In this paper we propose a sound and fully automatic verification
method for abstract models of multithreaded programs that provide name
generation, name mobility, and unbounded control. As a high level specification language
we adopt here an extension with value-passing of the formalism of (Ball et al. 2001)

1 Notice that in (Abdulla et al. 1996; Abdulla and Nylen 2000) a constraint denotes a symbolic
state whereas we use the word constraint to denote a symbolic representation of the relation of 
data variables (e.g. a linear arithmetic formula) used as part of the symbolic representation of 
sets of states (a constrained configuration). 
based on families of state machines used to specify abstractions of multithreaded
software libraries. The resulting language is called Thread Definition Language
(TDL). This formalism allows us to keep separate the finite control component of a
thread definition from the management of local variables (that in our setting range
over an infinite set of names), and to treat in isolation the operations to generate
fresh names, to transmit names, and to create new threads. In the present paper we
will show that the extension of the model of (Ball et al. 2001) with value-passing
makes the model Turing equivalent.

The verification methodology is based on the encoding of TDL programs into a
specification in the instance MSR_NC of the language scheme MSR(C) of (Delzanno 2001).
MSR_NC is obtained by taking as constraint system a subclass of linear arithmetic
with only = and > relations between variables, called name constraints (NC). The
low level specification language MSR_NC is not just instrumental for the encoding
of TDL programs. Indeed, it has been applied to model consistency and mutual
exclusion protocols in (Bozzano and Delzanno 2002; Delzanno 2005). Via this
encoding, the verification method based on symbolic backward reachability obtained
by instantiating the general method for MSR(C) to NC-constraints can now be
applied to abstract models of multithreaded programs. Although termination is not
guaranteed in general, the resulting verification method can succeed on practical
examples such as the Challenge-Response TDL program defined over binary predicates
that we will illustrate in the present paper. Furthermore, by propagating the sufficient
conditions for termination defined in (Bozzano and Delzanno 2002; Delzanno 2005)
back to TDL programs, we obtain an interesting class of decidable problems for
abstract models of multithreaded programs still providing name generation, name
mobility, and unbounded control.

Plan of the Paper In Section 2 we present the Thread Definition Language (TDL)
with examples of multithreaded programs. Furthermore, we discuss the
expressiveness of TDL programs showing that they can simulate Two Counter Machines. In
Section 3, after introducing the MSR_NC formalism, we show that TDL programs
can be simulated by MSR_NC specifications. In Section 4 we show how to transfer
the verification methods developed for MSR(C) to TDL programs. Furthermore, we
show that safety properties can be decided for the special class of monadic TDL
programs. In Section 5 we draw some conclusions and discuss related work.

2 Thread Definition Language (TDL) 

In this section we will define TDL programs. This formalism is a natural extension 
with value-passing of the communicating machines used by (Ball et al. 2001) to
specify abstractions of multithreaded software libraries. 

Terminology Let $\mathcal{N}$ be a denumerable set of names equipped with the relations $=$
and $\neq$ and a special element $\bot$ such that $n \neq \bot$ for any $n \in \mathcal{N}$.
Furthermore, let $\mathcal{V}$ be a denumerable set of variables,
$\mathcal{C} = \{c_1, \ldots, c_m\}$ a finite set of constants, and $\mathcal{L}$ a finite set of
internal action labels. For a fixed $V \subseteq \mathcal{V}$, the set of expressions is
defined as $\mathcal{E} = V \cup \mathcal{C} \cup \{\bot\}$ (when necessary we will write
$\mathcal{E}(V)$ to make explicit the set of variables $V$ upon which expressions are defined).
The set of channel expressions is defined as $\mathcal{E}_{ch} = V \cup \mathcal{C}$. Channel
expressions will be used as synchronization labels so as to establish communication links
only at execution time.

A guard over $V$ is a conjunction $\gamma_1, \ldots, \gamma_s$, where $\gamma_i$ is either
$true$, $x = e$ or $x \neq e$ with $x \in V$ and $e \in \mathcal{E}$ for $i : 1, \ldots, s$. An
assignment $\alpha$ from $V$ to $W$ is a conjunction like $x_i := e_i$ where $x_i \in W$,
$e_i \in \mathcal{E}(V)$ for $i : 1, \ldots, k$ and $x_r \neq x_s$ for $r \neq s$. A message
template $m$ over $V$ is a tuple $m = \langle x_1, \ldots, x_u \rangle$ of variables in $V$.

Definition 1 

A TDL program is a set $T = \{P_1, \ldots, P_t\}$ of thread definitions (with distinct names
for local variables and control locations). A thread definition $P$ is a tuple
$\langle Q, s_0, V, R \rangle$, where $Q$ is a finite set of control locations, $s_0 \in Q$ is
the initial location, $V \subseteq \mathcal{V}$ is a finite set of local variables, and $R$ is
a set of rules. Given $s, s' \in Q$, and $a \in \mathcal{L}$, a rule has one of the following
forms (see footnote 2):

• Internal move: $s \xrightarrow{a} s'[\gamma, \alpha]$, where $\gamma$ is a guard over $V$,
and $\alpha$ is an assignment from $V$ to $V$;

• Name generation: $s \xrightarrow{a} s'[x := new]$, where $x \in V$, and the expression
$new$ denotes a fresh name;

• Thread creation: $s \xrightarrow{a} s'[run\ P'\ with\ \alpha]$, where
$P' = \langle Q', t, W, R' \rangle \in T$, and $\alpha$ is an assignment from $V$ to $W$ that
specifies the initialization of the local variables of the new thread;

• Message sending: $s \xrightarrow{e!m} s'[\gamma, \alpha]$, where $e$ is a channel
expression, $m$ is a message template over $V$ that specifies which names to pass, $\gamma$
is a guard over $V$, and $\alpha$ is an assignment from $V$ to $V$;

• Message reception: $s \xrightarrow{e?m} s'[\gamma, \alpha]$, where $e$ is a channel
expression, $m$ is a message template over a new set of variables $V'$
($V' \cap V = \emptyset$) that specifies the names to receive, $\gamma$ is a guard over
$V \cup V'$ and $\alpha$ is an assignment from $V \cup V'$ to $V$.
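The syntax of Definition 1 can be pictured as plain data. The following Python sketch is only an illustration under stated assumptions: the class names `Rule` and `ThreadDef`, the field names, the string `"bot"` standing for the special element, and the helper `well_formed` are all hypothetical and do not come from the paper. It records one thread definition and checks the side conditions of the definition that can be tested on a single thread (locations belong to Q, assigned variables are local, a reception template uses new variables). The sample value encodes the initiator thread of the challenge and response example given later in this section.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BOT = "bot"  # hypothetical stand-in for the special element of the name domain


@dataclass(frozen=True)
class Rule:
    kind: str                    # "internal", "new", "run", "send" or "recv"
    source: str                  # control location s
    target: str                  # control location s'
    label: str = ""              # internal action label, or channel expression e
    message: Tuple[str, ...] = ()                  # message template m
    guard: Tuple[Tuple[str, str, str], ...] = ()   # conjuncts (x, "=" or "!=", e)
    assign: Tuple[Tuple[str, str], ...] = ()       # pairs (x, e) meaning x := e
    fresh_var: Optional[str] = None                # the x of x := new
    spawn: Optional[str] = None                    # the P' of run P' with alpha


@dataclass(frozen=True)
class ThreadDef:
    name: str
    locations: frozenset
    initial: str
    variables: Tuple[str, ...]
    rules: Tuple[Rule, ...]


def well_formed(t: ThreadDef) -> bool:
    """Check the side conditions that concern a single thread definition."""
    if t.initial not in t.locations:
        return False
    for r in t.rules:
        if r.source not in t.locations or r.target not in t.locations:
            return False
        if r.kind == "new" and r.fresh_var not in t.variables:
            return False
        # a reception template must use variables disjoint from the local ones
        if r.kind == "recv" and set(r.message) & set(t.variables):
            return False
        # except for thread creation, assignments must target local variables
        if r.kind != "run" and any(x not in t.variables for x, _ in r.assign):
            return False
    return True


# The initiator of the challenge and response example, written as data.
INIT = ThreadDef(
    name="Init",
    locations=frozenset({"initA", "genA", "waitA", "stopA"}),
    initial="initA",
    variables=("idA", "nA", "mA"),
    rules=(
        Rule("new", "initA", "genA", label="fresh", fresh_var="nA"),
        Rule("send", "genA", "waitA", label="c", message=("nA",)),
        Rule("recv", "waitA", "stopA", label="nA", message=("y",),
             assign=(("mA", "y"),)),
    ),
)
```

Keeping guards and assignments as separate tuples mirrors the separation between finite control and data management that the language is designed around.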

Before giving an example, we will formally introduce the operational semantics of 
TDL programs. 

2.1 Operational Semantics 

In the following we will use $N$ to indicate the subset of used names of $\mathcal{N}$. Every
constant $c \in \mathcal{C}$ is mapped to a distinct name $n_c \neq \bot \in N$, and $\bot$ is
mapped to $\bot$. Let $P = \langle Q, s, V, R \rangle$ and $V = \{x_1, \ldots, x_k\}$. A local
configuration is a tuple $p = \langle s', n_1, \ldots, n_k \rangle$ where $s' \in Q$ and
$n_i \in N$ is the current value of the variable $x_i \in V$ for $i : 1, \ldots, k$.

A global configuration $G = \langle N, p_1, \ldots, p_m \rangle$ is such that
$N \subseteq \mathcal{N}$ and $p_1, \ldots, p_m$ are local configurations defined over $N$ and
over the thread definitions in $T$. Note that

2 In this paper we keep assignments, name generation, and thread creation separate in order to 
simplify the presentation of the encoding into MSR. 
there is no relation between indexes in a global configuration in G and in T; G is
a pool of active threads, and several active threads can be instances of the same 
thread definition. 

Given a local configuration $p = \langle s', n_1, \ldots, n_k \rangle$, we define the
valuation $\rho_p$ as $\rho_p(x_i) = n_i$ if $x_i \in V$, $\rho_p(c) = n_c$ if
$c \in \mathcal{C}$, and $\rho_p(\bot) = \bot$. Furthermore, we say that $\rho_p$ satisfies the
guard $\gamma$ if $\rho_p(\gamma) \equiv true$, where $\rho_p$ is extended to constraints in
the natural way ($\rho_p(\varphi_1 \wedge \varphi_2) = \rho_p(\varphi_1) \wedge \rho_p(\varphi_2)$, etc.).
The execution of $x := e$ has the effect of updating the local variable $x$ of a thread
with the current value of $e$ (a name taken from the set of used values $N$). On the
contrary, the execution of $x := new$ associates a fresh unused name to $x$. The formula
run P with a has the effect of adding a new thread (in its initial control location) 
to the current global configuration. The initial values of the local variables of the 
generated thread are determined by the execution of a whose source variables are 
the local variables of the parent thread. The channel names used in a rendez-vous 
are determined by evaluating the channel expressions tagging sender and receiver 
rules. Value passing is achieved by extending the evaluation associated to the cur- 
rent configuration of the receiver so as to associate the output message of the sender 
to the variables in the input message template. The operational semantics is given 
via a binary relation => defined as follows. 

Definition 2 

Let $G = \langle N, \ldots, p, \ldots \rangle$, and $p = \langle s, n_1, \ldots, n_k \rangle$
be a local configuration for $P = \langle Q, s, V, R \rangle$, $V = \{x_1, \ldots, x_k\}$, then:

• If there exists a rule $s \xrightarrow{a} s'[\gamma, \alpha]$ in $R$ such that $\rho_p$
satisfies $\gamma$, then $G \Rightarrow \langle N, \ldots, p', \ldots \rangle$ (meaning that only
$p$ changes) where $p' = \langle s', n'_1, \ldots, n'_k \rangle$, $n'_i = \rho_p(e_i)$ if
$x_i := e_i$ is in $\alpha$, $n'_i = n_i$ otherwise, for $i : 1, \ldots, k$.

• If there exists a rule $s \xrightarrow{a} s'[x_i := new]$ in $R$, then
$G \Rightarrow \langle N', \ldots, p', \ldots \rangle$ where
$p' = \langle s', n'_1, \ldots, n'_k \rangle$, $n'_i$ is an unused name, i.e.,
$n'_i \in \mathcal{N} \setminus N$, $n'_j = n_j$ for every $j \neq i$, and
$N' = N \cup \{n'_i\}$;

• If there exists a rule $s \xrightarrow{a} s'[run\ P'\ with\ \alpha]$ in $R$ with
$P' = \langle Q', t_0, W, R' \rangle$, $W = \{y_1, \ldots, y_u\}$, and $\alpha$ is defined as
$y_1 := e_1, \ldots, y_u := e_u$ then $G \Rightarrow \langle N, \ldots, p', \ldots, q \rangle$
(we add a new thread whose initial local configuration is $q$) where
$p' = \langle s', n_1, \ldots, n_k \rangle$, and
$q = \langle t_0, \rho_p(e_1), \ldots, \rho_p(e_u) \rangle$.

• Let $q = \langle t, m_1, \ldots, m_r \rangle$ (distinct from $p$) be a local configuration in
$G$ associated with $P' = \langle Q', t_0, W, R' \rangle$, and let $w_i$ denote the $i$-th
variable of $W$.

Let $s \xrightarrow{e!m} s'[\gamma, \alpha]$ in $R$ and
$t \xrightarrow{e'?m'} t'[\gamma', \alpha']$ in $R'$ be two rules such that
$m = \langle x_1, \ldots, x_u \rangle$, $m' = \langle y_1, \ldots, y_v \rangle$ and $u = v$
(message templates match). We define $\sigma$ as the value passing evaluation
$\sigma(y_i) = \rho_p(x_i)$ for $i : 1, \ldots, u$, and $\sigma(z) = \rho_q(z)$ for $z \in W$.

Now if $\rho_p(e) = \rho_q(e')$ (channel names match), $\rho_p$ satisfies $\gamma$, and
$\sigma$ satisfies $\gamma'$, then
$\langle N, \ldots, p, \ldots, q, \ldots \rangle \Rightarrow \langle N, \ldots, p', \ldots, q', \ldots \rangle$
where $p' = \langle s', n'_1, \ldots, n'_k \rangle$, $n'_i = \rho_p(v)$ if $x_i := v$ is in
$\alpha$, $n'_i = n_i$ otherwise for $i : 1, \ldots, k$;
$q' = \langle t', m'_1, \ldots, m'_r \rangle$, $m'_i = \sigma(v)$ if $w_i := v$ is in
$\alpha'$, $m'_i = m_i$ otherwise for $i : 1, \ldots, r$.
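Two cases of Definition 2 can be sketched in Python to make the role of the valuation and of the value passing evaluation concrete. This is a minimal sketch under assumptions that are not in the paper: names are integers, the special element is the string `"bot"`, a local configuration is a pair of a location and a dictionary from variables to names, a rule is a flat tuple, and the helper names `rho`, `holds`, `step_new` and `step_rendezvous` are hypothetical.

```python
import itertools

BOT = "bot"  # hypothetical stand-in for the special element


def rho(env, consts, e):
    """Valuation of an expression: BOT, a local variable, or a constant."""
    if e == BOT:
        return BOT
    return env[e] if e in env else consts[e]


def holds(guard, val):
    """guard is a list of conjuncts (x, op, e) with op equal to "=" or "!="."""
    return all((val(x) == val(e)) == (op == "=") for x, op, e in guard)


def step_new(used, p, x, target):
    """Name generation: bind x to a name outside the used set."""
    loc, env = p
    n = next(i for i in itertools.count(1) if i not in used)
    return used | {n}, (target, {**env, x: n})


def step_rendezvous(p, q, send, recv, consts):
    """Synchronize a send rule of p with a receive rule of q, or return None.

    send = (s, e, m, guard, assign, s'), recv = (t, e', m', guard', assign', t').
    """
    (s, env_p), (t, env_q) = p, q
    s0, e, m, g, a, s1 = send
    t0, e2, m2, g2, a2, t1 = recv
    if s != s0 or t != t0 or len(m) != len(m2):
        return None  # wrong locations or message templates do not match

    def val_p(z):
        return rho(env_p, consts, z)

    # sigma extends the receiver's valuation with the names sent by p
    sigma_env = {**env_q, **{y: val_p(x) for x, y in zip(m, m2)}}

    def sigma(z):
        return rho(sigma_env, consts, z)

    if val_p(e) != rho(env_q, consts, e2):
        return None  # channel names do not match
    if not (holds(g, val_p) and holds(g2, sigma)):
        return None
    new_p = {**env_p, **{x: val_p(v) for x, v in a}}
    new_q = {**env_q, **{x: sigma(v) for x, v in a2}}
    return (s1, new_p), (t1, new_q)
```

For instance, with `consts = {"c": 0}`, a sender in location `genA` holding `nA = 3` and a receiver in `initB` synchronize on channel `c`, after which the receiver's variable assigned from the template holds 3. Internal moves and thread creation follow the same pattern and are omitted.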

Definition 3

An initial global configuration $G_0$ has an arbitrary (but finite) number of threads
with local variables all set to $\bot$. A run is a sequence $G_0 G_1 \ldots$ such that
$G_i \Rightarrow G_{i+1}$ for $i \geq 0$. A global configuration $G$ is reachable from $G_0$
if there exists a run from $G_0$ to $G$.

Example 1 

Let us consider a challenge and response protocol in which the goal of two agents
Alice and Bob is to exchange a pair of new names $\langle n_A, n_B \rangle$, the first one
created by Alice and the second one created by Bob, so as to build a composed secret key.
We can specify the protocol by using new names to dynamically establish private
channel names between instances of the initiator and of the responder. The TDL
program in Figure 1 follows this idea. The thread Init specifies the behavior of the
initiator. He first creates a new name using the internal action fresh, and stores
it in the local variable $n_A$. Then, he sends $n_A$ on channel $c$ (a constant), waits for
a name $y$ on a channel with the same name as the value of the local variable $n_A$
(the channel is specified by variable $n_A$) and then stores $y$ in the local variable
$m_A$. The thread Resp specifies the behavior of the responder. Upon reception of
a name $x$ on channel $c$, he stores it in the local variable $n_B$, then creates a new
name stored in local variable $m_B$ and finally sends the value in $m_B$ on the channel with
the same name as the value of $n_B$. The thread Main non-deterministically creates
new thread instances of type Init and Resp. The local variable $x$ is used to store
new names to be used for the creation of a new thread instance. Initially, all local
variables of threads Init/Resp are set to $\bot$. In order to allow process instances to
participate in several sessions (potentially with different principals), we could also
add the following rule

$stop_A \xrightarrow{restart} init_A\ [n_A := \bot, m_A := \bot]$

In this rule we require that roles and identities do not change from session to
session (see footnote 3). Starting from $G_0 = \langle N_0, \langle init_M, \bot \rangle \rangle$, and
running the Main thread we can generate any number of copies of the threads Init and Resp
each one with a unique identifier. Thus, we obtain global configurations like

$\langle N, \langle init_M, \bot \rangle,$
$\langle init_A, i_1, \bot, \bot \rangle, \ldots, \langle init_A, i_K, \bot, \bot \rangle,$
$\langle init_B, i_{K+1}, \bot, \bot \rangle, \ldots, \langle init_B, i_{K+L}, \bot, \bot \rangle \rangle$

where $N = \{\bot, i_1, \ldots, i_K, i_{K+1}, \ldots, i_{K+L}\}$ for $K, L \geq 0$. The threads
of type Init and Resp can start parallel sessions whenever created. For $K = 1$ and $L = 1$ one
possible session is as follows.
Starting from

$\langle \{\bot, i_1, i_2\}, \langle init_M, \bot \rangle, \langle init_A, i_1, \bot, \bot \rangle, \langle init_B, i_2, \bot, \bot \rangle \rangle$

3 By means of thread and fresh name creation it is also possible to specify a restart rule in which
a given process takes a potentially different role or identity.

Thread Init(local $id_A, n_A, m_A$);

  $init_A \xrightarrow{fresh} gen_A\ [n_A := new]$
  $gen_A \xrightarrow{c!\langle n_A \rangle} wait_A\ [true]$
  $wait_A \xrightarrow{n_A?\langle y \rangle} stop_A\ [m_A := y]$

Thread Resp(local $id_B, n_B, m_B$);

  $init_B \xrightarrow{c?\langle x \rangle} gen_B\ [n_B := x]$
  $gen_B \xrightarrow{fresh} ready_B\ [m_B := new]$
  $ready_B \xrightarrow{n_B!\langle m_B \rangle} stop_B\ [true]$

Thread Main(local $x$);

  $init_M \rightarrow create\ [x := new]$
  $create \xrightarrow{new_A} init_M\ [run\ Init\ with\ id_A := x, n_A := \bot, m_A := \bot, x := \bot]$
  $create \xrightarrow{new_B} init_M\ [run\ Resp\ with\ id_B := x, n_B := \bot, m_B := \bot, x := \bot]$

Fig. 1. Example of thread definitions.

if we apply the first rule of thread Init to $\langle init_A, i_1, \bot, \bot \rangle$ we obtain

$\langle \{\bot, i_1, i_2, a^1\}, \langle init_M, \bot \rangle, \langle gen_A, i_1, a^1, \bot \rangle, \langle init_B, i_2, \bot, \bot \rangle \rangle$

where $a^1$ is the generated name ($a^1$ is distinct from $\bot$, $i_1$, and $i_2$). Now if we
apply the second rule of thread Init and the first rule of thread Resp (synchronization
on channel $c$) we obtain

$\langle \{\bot, i_1, i_2, a^1\}, \langle init_M, \bot \rangle, \langle wait_A, i_1, a^1, \bot \rangle, \langle gen_B, i_2, a^1, \bot \rangle \rangle$

If we apply the second rule of thread Resp we obtain

$\langle \{\bot, i_1, i_2, a^1, a^2\}, \langle init_M, \bot \rangle, \langle wait_A, i_1, a^1, \bot \rangle, \langle ready_B, i_2, a^1, a^2 \rangle \rangle$

Finally, if we apply the last rule of thread Init and Resp (synchronization on
channel $a^1$) we obtain

$\langle \{\bot, i_1, i_2, a^1, a^2\}, \langle init_M, \bot \rangle, \langle stop_A, i_1, a^1, a^2 \rangle, \langle stop_B, i_2, a^1, a^2 \rangle \rangle$

Thus, at the end of the session the thread instances $i_1$ and $i_2$ have both a local
copy of the fresh names $a^1$ and $a^2$. Note that a copy of the main thread $\langle init_M, \bot \rangle$
is always active in any reachable configuration, and, at any time, it may introduce 
new threads (either of type Init or Resp) with fresh identifiers. Generation of fresh 
names is also used by the threads of type Init and Resp to create nonces. Fur- 
thermore, threads can restart their life cycle (without changing identifiers). Thus, 
in this example the set of possible reachable configurations is infinite and contains 
configurations with arbitrarily many threads and fresh names. Since names are 
stored in the local variables of active threads, the local data also range over an 
infinite domain. □ 
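The observation that the main thread can keep introducing threads with fresh identifiers can be made concrete with a short Python sketch. It is only an illustration: the function name `run_main`, the string identifiers `i1, i2, ...` and the string `"bot"` for the special element are assumptions made here, and the two rules of Main are collapsed into one loop iteration. The function builds the global configuration obtained after Main has created K initiators and L responders.

```python
BOT = "bot"  # hypothetical stand-in for the special element


def run_main(k, l):
    """Apply Main's creation rules k times for Init and l times for Resp.

    Returns the set of used names and the list of local configurations.
    """
    used = {BOT}
    threads = [("initM", (BOT,))]
    for count, role in enumerate(["A"] * k + ["B"] * l, start=1):
        ident = f"i{count}"          # x := new, a name not used so far
        assert ident not in used
        used.add(ident)
        # run Init/Resp with id := x and the other two variables undefined;
        # x is then reset, so Main's own local configuration is unchanged
        threads.append((f"init{role}", (ident, BOT, BOT)))
    return used, threads
```

For any K and L the result has K + L + 1 used names and K + L + 1 threads, so neither the number of threads nor the number of names in reachable configurations is bounded.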

2.2 Expressive Power of TDL

To study the expressive power of the TDL language, we will compare it with the
Turing equivalent formalism called Two Counter Machines. A Two Counter Machine
configuration is a tuple $\langle \ell, c_1 = n_1, c_2 = n_2 \rangle$ where $\ell$ is a control
location taken from a finite set $Q$, and $n_1$ and $n_2$ are natural numbers that represent
the values of the counters $c_1$ and $c_2$. Each counter can be incremented or decremented (if
greater than zero) by one. Transitions combine operations on individual counters
with changes of control locations. Specifically, the instructions for counter $c_i$ are as
follows

Inc: $\ell_1$: $c_i := c_i + 1$; goto $\ell_2$;

Dec: $\ell_1$: if $c_i > 0$ then $c_i := c_i - 1$; goto $\ell_2$; else goto $\ell_3$;

A Two Counter Machine consists then of a list of instructions and of the initial
state $\langle \ell_0, c_1 = 0, c_2 = 0 \rangle$. The operational semantics is defined according to
the intuitive semantics of the instructions. Problems like control state reachability are
undecidable for this computational model.
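The intuitive semantics of the two instructions can be written down as a small interpreter. The Python sketch below rests on conventions chosen here for illustration and not taken from the paper: a program is a dictionary from locations to instructions, counters are indexed 0 and 1, a location without an instruction is treated as halting, and the hypothetical parameter `max_steps` bounds the run because termination cannot be decided in general.

```python
def run_2cm(program, start, max_steps=1000):
    """Run a Two Counter Machine from location `start` with both counters at 0.

    program maps a location to ("inc", i, l2) or ("dec", i, l2, l3).
    Returns (location, (c1, c2)) on halting, or None if the bound is hit.
    """
    loc, counters = start, [0, 0]
    for _ in range(max_steps):
        if loc not in program:
            return loc, tuple(counters)   # no instruction: halt here
        instr = program[loc]
        if instr[0] == "inc":
            _, i, nxt = instr
            counters[i] += 1
            loc = nxt
        else:
            _, i, nxt, alt = instr
            if counters[i] > 0:           # decrement only if greater than zero
                counters[i] -= 1
                loc = nxt
            else:
                loc = alt
    return None


# Set the first counter to 2, then move its content to the second counter.
MOVE = {
    "l0": ("inc", 0, "l1"),
    "l1": ("inc", 0, "l2"),
    "l2": ("dec", 0, "l3", "l4"),
    "l3": ("inc", 1, "l2"),
}
```

Running `run_2cm(MOVE, "l0")` ends in location `l4` with the first counter at 0 and the second at 2.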
The following property then holds. 

Theorem 1 

TDL programs can simulate Two Counter Machines. 
Proof 

In order to define a TDL program that simulates a Two Counter Machine we
proceed as follows. Every counter is represented via a doubly linked list implemented
via a collection of threads of type Cell and with a unique thread of type Last
pointing to the head of the list. The $i$-th counter having value zero is represented
as the empty list $Cell(i, v, v), Last(i, v, w)$ for some names $v$ and $w$ (we will explain
later the use of $w$). The $i$-th counter having value $k$ is represented as

$Cell(i, v_0, v_0), Cell(i, v_0, v_1), \ldots, Cell(i, v_{k-1}, v_k), Last(i, v_k, w)$

for distinct names $v_0, v_1, \ldots, v_k$. The instructions on a counter are simulated by
sending messages to the corresponding Last thread. The messages are sent on
channel Zero (zero test), Dec (decrement), and Inc (increment). In reply to each
of these messages, the thread Last sends an acknowledgment, namely Yes/No for
the zero test, DAck for the decrement, IAck for the increment operation. Last
interacts with the Cell threads via the messages tstC, decC, incC acknowledged
by messages z/nz, dack, iack. The interactions between a Last thread and the Cell
threads are as follows.

Zero Test Upon reception of a message (x) on channel Zero, the Last thread with 
local variables id, last, aux checks that its identifier id matches x - see transition 
from Idle to Busy - sends a message (id, last) on channel tstC directed to the cell 
pointed to by last (transition from Busy to Wait), and then waits for an answer. If 
the answer is sent on channel nz, standing for non-zero (resp. z, standing for zero)
- see transition from Wait to AckNZ (resp. AckZ) - then it sends its identifier on 

Thread Last(local $id, last, aux$);

(Zero test)

  $Idle \xrightarrow{Zero?\langle x \rangle} Busy\ [id = x]$
  $Busy \xrightarrow{tstC!\langle id, last \rangle} Wait$
  $Wait \xrightarrow{nz?\langle x \rangle} AckNZ\ [id = x]$
  $Wait \xrightarrow{z?\langle x \rangle} AckZ\ [id = x]$
  $AckZ \xrightarrow{Yes!\langle id \rangle} Idle$
  $AckNZ \xrightarrow{No!\langle id \rangle} Idle$

(Decrement)

  $Idle \xrightarrow{Dec?\langle x \rangle} DBusy\ [id = x]$
  $DBusy \xrightarrow{decC!\langle id, last \rangle} DWait$
  $DWait \xrightarrow{dack?\langle x, u \rangle} DAck\ [id = x, last := u]$
  $DAck \xrightarrow{DAck!\langle id \rangle} Idle$

(Increment)

  $Idle \xrightarrow{Inc?\langle x \rangle} INew\ [id = x]$
  $INew \rightarrow IRun\ [aux := new]$
  $IRun \xrightarrow{run} IAck\ [run\ Cell\ with\ idc := id; prev := last; next := aux]$
  $IAck \xrightarrow{IAck!\langle id \rangle} Idle\ [last := aux]$

Fig. 2. The process defining the last cell of the linked list associated to a counter



channel No (resp. Yes) as an acknowledgment to the first message - see transition
from AckNZ (resp. AckZ) to Idle. As shown in Fig. 3, the thread Cell with local
variables idc, prev, and next that receives the message tstC, i.e., pointed to by a
thread Last with the same identifier as idc, sends an acknowledgment on channel
z (zero) if $prev = next$, and on channel nz (non-zero) if $prev \neq next$.

Decrement Upon reception of a message $\langle x \rangle$ on channel Dec, the Last thread with
local variables id, last, aux checks that its identifier id matches x (transition from
Idle to DBusy), sends a message $\langle id, last \rangle$ on channel decC directed to the cell
pointed to by last (transition from DBusy to DWait), and then waits for an answer.

Thread Cell(local $idc, prev, next$);

(Zero test)

  $idle \xrightarrow{tstC?\langle x, u \rangle} ackZ\ [x = idc, u = next, prev = next]$
  $idle \xrightarrow{tstC?\langle x, u \rangle} ackNZ\ [x = idc, u = next, prev \neq next]$
  $ackZ \xrightarrow{z!\langle idc \rangle} idle$
  $ackNZ \xrightarrow{nz!\langle idc \rangle} idle$

(Decrement)

  $idle \xrightarrow{decC?\langle x, u \rangle} dec\ [x = idc, u = next, prev \neq next]$
  $dec \xrightarrow{dack!\langle idc, prev \rangle} idle$

Fig. 3. The process defining a cell of the linked list associated to a counter

If the answer is sent on channel dack (transition from DWait to DAck) then it 
updates the local variable last with the pointer u sent by the thread Cell, namely 
the prev pointer of the cell pointed to by the current value of last, and then sends 
its identifier on channel DAck to acknowledge the first message (transition from 
DAck to Idle). 

As shown in Fig. 3, a thread Cell with local variables idc, prev, and next that
receives the message decC and such that $next = last$ sends as an acknowledgment
on channel dack the value prev.

Increment To simulate the increment operation, Last does not have to interact with
existing Cell threads. Indeed, it only has to link a new Cell thread to the head of
the list (this is why the Cell thread has no operations to handle the increment
operation). As shown in Fig. 2, this can be done by creating a new name stored in
the local variable aux (transition from INew to IRun) and spawning a new Cell
thread (transition from IRun to IAck) with prev pointer equal to last, and next
pointer equal to aux. Finally, it acknowledges the increment request by sending its
identifier on channel IAck and updates variable last with the current value of aux.
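The list representation of a counter and the three operations just described can be mimicked sequentially in Python. The sketch below is an illustration under assumptions made here: the class name `ListCounter`, its method names, and the use of integers drawn from `itertools.count` as fresh names are hypothetical, and message passing between Last and Cell threads is replaced by direct lookups in a list of cell triples `(idc, prev, next)`.

```python
import itertools


class ListCounter:
    """A counter stored as cells (idc, prev, next) plus a `last` pointer."""

    def __init__(self, ident, names):
        self.id = ident
        self.names = names              # iterator yielding fresh names
        p = next(names)
        self.cells = [(ident, p, p)]    # Cell(id, p, p): the empty list
        self.last = p                   # Last(id, p, _)

    def _pointed(self):
        # the cell pointed to by last is the one whose next field equals last
        return next(c for c in self.cells if c[2] == self.last)

    def is_zero(self):
        _, prev, nxt = self._pointed()
        return prev == nxt              # answer z if equal, nz otherwise

    def inc(self):
        aux = next(self.names)          # aux := new
        self.cells.append((self.id, self.last, aux))  # run Cell with prev := last, next := aux
        self.last = aux                 # last := aux

    def dec(self):
        _, prev, nxt = self._pointed()
        if prev == nxt:
            return False                # no cell accepts decC on an empty list
        self.last = prev                # last := u, the prev pointer sent on dack
        return True

    def value(self):
        # walk back from last until the cell with prev = next is reached
        n, cur = 0, self.last
        while True:
            _, prev, nxt = next(c for c in self.cells if c[2] == cur)
            if prev == nxt:
                return n
            n, cur = n + 1, prev
```

As in the thread encoding, a decrement does not delete the cell it leaves behind; because every increment uses a fresh name, such a cell can never again be the one pointed to by `last`.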

Two Counter Machine Instructions We are now ready to use the operations
provided by the thread Last to simulate the instructions of a Two Counter Machine.
As shown in Fig. 4, we use a thread CM with two local variables $id_1$, $id_2$ to
represent the list of instructions of a 2CM with counters $c_1$, $c_2$. Control locations of the
Two Counter Machine are used as local states of the thread CM. The initial local
state of the CM thread is the initial control location. The increment instruction
on counter $c_i$ at control location $\ell_1$ is simulated by a handshake with the Last
thread with identifier $id_i$: we first send the message $Inc!\langle id_i \rangle$, wait for the
acknowledgment on channel IAck and then move to state $\ell_2$. Similarly, for the decrement
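
(Instruction: $\ell_1$: $c_i := c_i + 1$; goto $\ell_2$;)

  $\ell_1 \xrightarrow{Inc!\langle id_i \rangle} wait_{\ell_1}$
  $wait_{\ell_1} \xrightarrow{IAck?\langle x \rangle} \ell_2\ [x = id_i]$

(Instruction: $\ell_1$: if $c_i > 0$ then $c_i := c_i - 1$; goto $\ell_2$; else goto $\ell_3$;)

  $\ell_1 \xrightarrow{Zero!\langle id_i \rangle} wait_{\ell_1}$
  $wait_{\ell_1} \xrightarrow{NZAck?\langle x \rangle} dec_{\ell_1}\ [x = id_i]$
  $dec_{\ell_1} \xrightarrow{Dec!\langle id_i \rangle} wdec_{\ell_1}$
  $wdec_{\ell_1} \xrightarrow{DAck?\langle y \rangle} \ell_2\ [y = id_i]$
  $wait_{\ell_1} \xrightarrow{ZAck?\langle x \rangle} \ell_3\ [x = id_i]$

Fig. 4. The thread associated to a 2CM.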



Thread Init(local $nid_1, p_1, nid_2, p_2$);

  $init \rightarrow init_1\ [nid_1 := new]$
  $init_1 \xrightarrow{freshP} init_2\ [p_1 := new]$
  $init_2 \xrightarrow{runC} init_3\ [run\ Cell\ with\ idc := nid_1; prev := p_1; next := p_1]$
  $init_3 \xrightarrow{runL} init_4\ [run\ Last\ with\ idc := nid_1; last := p_1; aux := \bot]$
  $init_4 \rightarrow init_5\ [nid_2 := new]$
  $init_5 \xrightarrow{freshP} init_6\ [p_2 := new]$
  $init_6 \xrightarrow{runC} init_7\ [run\ Cell\ with\ idc := nid_2; prev := p_2; next := p_2]$
  $init_7 \xrightarrow{runL} init_8\ [run\ Last\ with\ idc := nid_2; last := p_2; aux := \bot]$
  $init_8 \xrightarrow{runCM} init_9\ [run\ 2CM\ with\ id_1 := nid_1; id_2 := nid_2]$

Fig. 5. The initialization thread.

instruction on counter $c_i$ at control location $\ell_1$ we first send the message
$Zero!\langle id_i \rangle$. If we receive an acknowledgment on channel NZAck we send a Dec
request, wait for completion and then move to $\ell_2$. If we receive an acknowledgment on
channel ZAck we directly move to $\ell_3$.

Initialization The last step of the encoding is the definition of the initial state of the
system. For this purpose, we use the thread Init of Fig. 5. The first four rules of Init
initialize the first counter: they create two new names $nid_1$ (an identifier for counter
$c_1$) and $p_1$, and then spawn the new threads $Cell(nid_1, p_1, p_1)$, $Last(nid_1, p_1, \bot)$.
The following four rules spawn the new threads $Cell(nid_2, p_2, p_2)$, $Last(nid_2, p_2, \bot)$.
After this stage, we create a thread of type 2CM to start the simulation of the
instructions of the Two Counter Machine. The initial configuration of the whole
system is $G_0 = \langle init, \bot, \bot \rangle$. By construction we have that an execution step
from $\langle \ell_1, c_1 = n_1, c_2 = n_2 \rangle$ to $\langle \ell_2, c_1 = m_1, c_2 = m_2 \rangle$ is
simulated by an execution run going from a global configuration in which the local state of
thread CM is $\langle \ell_1, id_1, id_2 \rangle$ and in which we have $n_i$ occurrences of thread
Cell with the same identifier $id_i$ for $i : 1, 2$, to a global configuration in which the
local state of thread CM is $\langle \ell_2, id_1, id_2 \rangle$ and in which we have $m_i$
occurrences of thread Cell with the same identifier $id_i$ for $i : 1, 2$. Thus, every
execution of a 2CM $M$ corresponds to an execution of the corresponding TDL program that
starts from the initial configuration $G_0 = \langle init, \bot, \bot \rangle$. □

As a consequence of the previous theorem, we have the following corollary. 
Corollary 1 

Given a TDL program, an initial global configuration $G_0$, and a control location $\ell$,
deciding if there exists a run going from $G_0$ to a global configuration that contains
$\ell$ (control state reachability) is an undecidable problem.

3 From TDL to MSR_NC

As mentioned in the introduction, our verification methodology is based on a
translation of TDL programs into low level specifications given in MSR_NC. Our goal is
to extend the connection between CCS and Petri Nets (German and Sistla 1992)
to TDL and MSR so as to be able to apply the verification methods defined in
(Delzanno 2005) to multithreaded programs. In the next section we will summarize
the main features of the language MSR_NC introduced in (Delzanno 2001).

3.1 Preliminaries on MSR_NC

NC-constraints are linear arithmetic constraints in which conjuncts have one of
the following forms: $true$, $x = y$, $x > y$, $x = c$, or $x > c$, $x$ and $y$ being two
variables from a denumerable set $\mathcal{V}$ that range over the rationals, and $c$ being an
integer. The solutions $Sol$ of a constraint $\varphi$ are defined as all evaluations (from
$\mathcal{V}$ to $\mathbb{Q}$) that satisfy $\varphi$. A constraint $\varphi$ is satisfiable
whenever $Sol(\varphi) \neq \emptyset$. Furthermore, $\psi$ entails $\varphi$ whenever
$Sol(\psi) \subseteq Sol(\varphi)$. NC-constraints are closed under elimination
of existentially quantified variables.
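To fix ideas, membership of an evaluation in the solution set of a constraint can be checked directly. In the Python sketch below, the representation is an assumption made for illustration: a constraint is a list of conjuncts `(x, op, rhs)` with `op` equal to `"="` or `">"` and `rhs` either a variable name or an integer, the empty list plays the role of true, and the function names `term` and `satisfies` are hypothetical. Rationals are represented with the standard `fractions.Fraction` type.

```python
from fractions import Fraction


def term(sigma, t):
    """Value of a right-hand side: a variable looked up in sigma, or an integer."""
    return sigma[t] if isinstance(t, str) else Fraction(t)


def satisfies(sigma, constraint):
    """Return True if the evaluation sigma is a solution of the conjunction."""
    for x, op, rhs in constraint:
        left, right = sigma[x], term(sigma, rhs)
        if op == "=" and left != right:
            return False
        if op == ">" and not left > right:
            return False
    return True


# x > y and y > 0
PHI = [("x", ">", "y"), ("y", ">", 0)]
```

For example, the evaluation mapping x to 3/2 and y to 1/2 is a solution of `PHI`, while the one mapping both variables to 1/2 is not.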

Let $\mathcal{P}$ be a set of predicate symbols. An atomic formula $p(x_1, \ldots, x_n)$ is
such that $p \in \mathcal{P}$, and $x_1, \ldots, x_n$ are distinct variables in $\mathcal{V}$.
A multiset of atomic formulas is indicated as $A_1 \mid \ldots \mid A_k$, where $A_i$ and $A_j$
have distinct variables (we use variable renaming if necessary), and $\mid$ is the multiset
constructor.

In the rest of the paper we will use $\mathcal{M}, \mathcal{N}, \ldots$ to denote multisets of
atomic formulas, $\epsilon$ to denote the empty multiset, $\oplus$ to denote multiset union and
$\ominus$ to denote multiset difference. An MSR_NC configuration is a multiset of ground
atomic formulas, i.e., atomic formulas like $p(d_1, \ldots, d_n)$ where $d_i$ is a rational
for $i : 1, \ldots, n$.

An MSR_NC rule has the form $\mathcal{M} \rightarrow \mathcal{M}' : \varphi$, where
$\mathcal{M}$ and $\mathcal{M}'$ are two (possibly empty) multisets of atomic formulas with
distinct variables built on predicates in $\mathcal{P}$, and $\varphi$ is an NC-constraint.
The ground instances of an MSR_NC rule are defined as

$Inst(\mathcal{M} \rightarrow \mathcal{M}' : \varphi) = \{\sigma(\mathcal{M}) \rightarrow \sigma(\mathcal{M}') \mid \sigma \in Sol(\varphi)\}$

where $\sigma$ is extended in the natural way to multisets, i.e., $\sigma(\mathcal{M})$ and
$\sigma(\mathcal{M}')$ are MSR_NC configurations.

An MSR_NC specification $\mathcal{S}$ is a tuple $\langle \mathcal{P}, \mathcal{I}, \mathcal{R} \rangle$,
where $\mathcal{P}$ is a finite set of predicate symbols, $\mathcal{I}$ is a finite set of
(initial) MSR_NC configurations, and $\mathcal{R}$ is a finite set of MSR_NC rules over
$\mathcal{P}$.

The operational semantics describes the update from a configuration $\mathcal{M}$ to one of its
possible successor configurations $\mathcal{M}'$. $\mathcal{M}'$ is obtained from $\mathcal{M}$ by rewriting (modulo
associativity and commutativity) the left-hand side of an instance of a rule into the
corresponding right-hand side. In order to be fireable, the left-hand side must be
included in $\mathcal{M}$. Since instances and rules are selected in a non-deterministic way, in
general a configuration can have a (possibly infinite) set of (one-step) successors.

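Formally, a rule $\mathcal{H} \to \mathcal{B} : \varphi$ from $\mathcal{R}$ is enabled at $\mathcal{M}$ via the ground substitution
$\sigma \in Sol(\varphi)$ if and only if $\sigma(\mathcal{H}) \preccurlyeq \mathcal{M}$. Firing rule $R$ enabled at $\mathcal{M}$ via $\sigma$ yields the
new configuration

$$\mathcal{M}' = \sigma(\mathcal{B}) \oplus (\mathcal{M} \ominus \sigma(\mathcal{H}))$$

We use $\mathcal{M} \Rightarrow_{MSR} \mathcal{M}'$ to denote the firing of a rule at $\mathcal{M}$ yielding $\mathcal{M}'$.
A run is a sequence of configurations $\mathcal{M}_0 \mathcal{M}_1 \ldots \mathcal{M}_k$ with $\mathcal{M}_0 \in \mathcal{I}$ such that
$\mathcal{M}_i \Rightarrow_{MSR} \mathcal{M}_{i+1}$ for $i \geq 0$. A configuration $\mathcal{M}$ is reachable if there exists $\mathcal{M}_0 \in \mathcal{I}$
such that $\mathcal{M}_0 \Rightarrow^*_{MSR} \mathcal{M}$, where $\Rightarrow^*_{MSR}$ is the transitive closure of $\Rightarrow_{MSR}$.
Finally, the successor and predecessor operators $Post$ and $Pre$ are defined on a set
of configurations $S$ as $Post(S) = \{\mathcal{M}' \mid \mathcal{M} \Rightarrow_{MSR} \mathcal{M}', \mathcal{M} \in S\}$ and
$Pre(S) = \{\mathcal{M} \mid \mathcal{M} \Rightarrow_{MSR} \mathcal{M}', \mathcal{M}' \in S\}$, respectively. $Pre^*$ and $Post^*$ denote their
transitive closure.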
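The firing condition and the one-step successor operator defined above can be
sketched on ground rule instances as follows. This is only an illustrative sketch:
the representation of atoms as Python tuples and of configurations as `Counter`
multisets, as well as the names `fire` and `post`, are assumptions made here and
are not part of the formal development.

```python
from collections import Counter

def fire(config, lhs, rhs):
    """Fire the ground rule instance lhs -> rhs at config.

    Returns the successor rhs (+) (config (-) lhs), or None when the
    instance is not enabled, i.e. lhs is not a sub-multiset of config.
    """
    if any(config[atom] < count for atom, count in lhs.items()):
        return None
    return (config - lhs) + rhs

def post(config, ground_rules):
    """One-step successors of config for a finite list of ground instances."""
    successors = []
    for lhs, rhs in ground_rules:
        nxt = fire(config, lhs, rhs)
        if nxt is not None:
            successors.append(nxt)
    return successors

# Hypothetical ground instance: a thread in state initA takes a fresh name.
m = Counter({("fresh", 4): 1, ("initA", 2, 0, 0): 1})
lhs = Counter({("fresh", 4): 1, ("initA", 2, 0, 0): 1})
rhs = Counter({("fresh", 8): 1, ("genA", 2, 6, 0): 1})
successor = fire(m, lhs, rhs)
```

Since a rule with constraints generally has infinitely many ground instances,
such an explicit enumeration only makes sense for a finite selection of them.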

As shown in (Delzanno 2001; Bozzano and Delzanno 2002), Petri Nets represent a
natural abstraction of $MSR_{NC}$ (and more in general of MSR rules with constraints)
specifications. They can be encoded, in fact, in propositional MSR specifications
(e.g. abstracting away arguments from atomic formulas).

3.2 Translation from TDL to $MSR_{NC}$

The first thing to do is to find an adequate representation of names. Since all we
need is a way to distinguish old and new names, we just need an infinite domain
in which the $=$ and $\neq$ relations are supported. Thus, we can interpret names in $\mathcal{N}$
either as integers or as rational numbers. Since operations like variable elimination
are computationally less expensive over the rationals than over the integers, we
choose to view names as non-negative rationals. Thus, a local (TDL) configuration
$p = (s, n_1, \ldots, n_k)$ is encoded as the atomic formula $p^\bullet = s(n_1, \ldots, n_k)$, where each
$n_i$ is a non-negative rational. Furthermore, a global (TDL) configuration
$G = (N, p_1, \ldots, p_m)$ is encoded as an $MSR_{NC}$ configuration $G^\bullet$

$$p_1^\bullet \mid \ldots \mid p_m^\bullet \mid fresh(n)$$

where the value $n$ in the auxiliary atomic formula $fresh(n)$ is a rational number
strictly greater than all values occurring in $p_1^\bullet, \ldots, p_m^\bullet$. The predicate $fresh$ will
allow us to generate unused names every time needed.
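As a small illustration of this encoding, the following sketch builds the list of
ground atoms for a global configuration. The function name `encode`, the tuple
representation of local configurations, and the choice of `max + 1` as the value
stored in `fresh` are assumptions made for the example; the definition above only
requires some rational strictly greater than all values in use.

```python
def encode(local_configs, h):
    """Encode local configurations (state, names) as ground atoms plus fresh(n).

    h maps TDL names to non-negative rationals (the name interpretation).
    """
    atoms = [(state,) + tuple(h[name] for name in names)
             for state, names in local_configs]
    used = [value for atom in atoms for value in atom[1:]]
    # Assumption: take max + 1; any value strictly above all used values works.
    fresh_value = max(used, default=0) + 1
    return atoms + [("fresh", fresh_value)]

encoded = encode([("initA", ("a", "bot", "bot"))], {"a": 2, "bot": 0})
```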

The translation of constants $\mathcal{C} = \{c_1, \ldots, c_m\}$ and variables is defined as follows:
$x^\bullet = x$ for $x \in \mathcal{V}$, $\bot^\bullet = 0$, $c_i^\bullet = i$ for $i : 1, \ldots, m$. We extend $\cdot^\bullet$ in the natural way
on a guard $\gamma$, by decomposing every formula $x \neq e$ into $x < e^\bullet$ and $x > e^\bullet$. We will
call $\gamma^\bullet$ the resulting set of NC-constraints. 4

Given $V = \{x_1, \ldots, x_k\}$, we define $V'$ as the set of new variables $\{x'_1, \ldots, x'_k\}$.
Now, let us consider the assignment $\alpha$ defined as $x_1 := e_1, \ldots, x_k := e_k$ (we add
assignments like $x_i := x_i$ if some variable does not occur as target of $\alpha$). Then, $\alpha^\bullet$
is the NC-constraint $x'_1 = e_1^\bullet, \ldots, x'_k = e_k^\bullet$.

The translation of thread definitions is defined below (where we will often refer to
Example 1).

Initial Global Configuration. Given an initial global configuration consisting of the
local configurations $(s_i, n_{i1}, \ldots, n_{ik_i})$ with $n_{ij} = \bot$ for $i : 1, \ldots, u$, we define the
following $MSR_{NC}$ rule

$$init \to s_1(x_{11}, \ldots, x_{1k_1}) \mid \ldots \mid s_u(x_{u1}, \ldots, x_{uk_u}) \mid fresh(x) : x > C, x_{11} = 0, \ldots, x_{uk_u} = 0$$

where $C$ is the largest rational used to interpret the constants in $\mathcal{C}$.

For each thread definition $P = (Q, s_0, V, R)$ in $\mathcal{T}$ with $V = \{x_1, \ldots, x_k\}$ we translate
the rules in $R$ as described below.

Internal Moves. For every internal move $s \to s'[\gamma, \alpha]$, and every $\nu \in \gamma^\bullet$ we define

$$s(x_1, \ldots, x_k) \to s'(x'_1, \ldots, x'_k) : \nu, \alpha^\bullet$$

Name Generation. For every name generation $s \to s'[x_i := new]$, we define

$$s(x_1, \ldots, x_k) \mid fresh(x) \to s'(x'_1, \ldots, x'_k) \mid fresh(y) : y > x'_i, x'_i > x, \bigwedge_{j \neq i} x'_j = x_j$$

For instance, the name generation $init_A \xrightarrow{fresh_A} gen_A[n := new]$ is mapped into the
$MSR_{NC}$ rule $init_A(id, x, y) \mid fresh(u) \to gen_A(id', x', y') \mid fresh(u') : \varphi$ where $\varphi$
is the constraint $u' > x', x' > u, y' = y, id' = id$. The constraint $x' > u$ represents
the fact that the new name associated to the local variable $n$ (the second argument
of the atoms representing the thread) is fresh, whereas $u' > x'$ updates the current
value of $fresh$ to ensure that the next generated names will be picked up from
unused values.

4 As an example, if $\gamma$ is the constraint $x = 1, x \neq z$ then $\gamma^\bullet$ consists of the two
constraints $x = 1, x > z$ and $x = 1, z > x$.
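One ground instance of the name generation rule can be computed as in the sketch
below. The function name `generate_name`, the zero-based argument index, and the
fixed increments of 1 are illustrative assumptions: the rule itself admits any
pair of rationals with $x < x'_i < y$.

```python
from fractions import Fraction

def generate_name(atom, index, target_state, fresh):
    """Apply the name generation rule to a ground thread atom.

    atom is (state, v_1, ..., v_k); index is the zero-based position of the
    variable receiving the new name; fresh is the current value in fresh(x).
    Returns the rewritten atom and the updated fresh value.
    """
    _state, *args = atom
    new_name = fresh + 1         # satisfies x'_i > x
    new_fresh = new_name + 1     # satisfies y > x'_i
    args[index] = new_name       # every other argument is copied unchanged
    return (target_state, *args), new_fresh

# Hypothetical step: initA(2, 0, 0) | fresh(4) -> genA(2, 5, 0) | fresh(6)
new_atom, new_fresh = generate_name(("initA", 2, 0, 0), 1, "genA", Fraction(4))
```

Because the new name is strictly greater than the old value of `fresh`, which in
turn bounds every name in use, it cannot coincide with any existing name.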

Thread Creation. Let $P = (Q', t_0, V', R')$ and $V' = \{y_1, \ldots, y_u\}$. Then, for every
thread creation $s \to s'[run\ P\ with\ \alpha]$, we define

$$s(x_1, \ldots, x_k) \to s'(x'_1, \ldots, x'_k) \mid t_0(y'_1, \ldots, y'_u) : x'_1 = x_1, \ldots, x'_k = x_k, \alpha^\bullet.$$

E.g., consider the rule $create \xrightarrow{new_A} init_M[run\ Init\ with\ id := x, \ldots]$ of Example 1.
Its encoding yields the $MSR_{NC}$ rule $create(x) \to init_M(x') \mid init_A(id', n', m') : \psi$
where $\psi$ represents the initialization of the local variables of the new thread:
$x' = x, id' = x, n' = 0, m' = 0$.

Rendez-vous. The encoding of rendez-vous communication is based on the use of
constraint operations like variable elimination. Let $P$ and $P'$ be a pair of thread
definitions, with local variables $V = \{x_1, \ldots, x_k\}$ and $V' = \{y_1, \ldots, y_l\}$ with
$V \cap V' = \emptyset$. We first select all rules $s \xrightarrow{e!m} s'[\gamma, \alpha]$ in $R$ and $t \xrightarrow{e?m'} t'[\gamma', \alpha']$ in $R'$,
such that $m = (w_1, \ldots, w_u)$, $m' = (w'_1, \ldots, w'_v)$ and $u = v$. Then, we define the new
$MSR_{NC}$ rule

$$s(x_1, \ldots, x_k) \mid t(y_1, \ldots, y_l) \to s'(x'_1, \ldots, x'_k) \mid t'(y'_1, \ldots, y'_l) : \varphi$$

for every $\nu \in \gamma^\bullet$ and $\nu' \in \gamma'^\bullet$ such that the NC-constraint $\varphi$ obtained by eliminating
$w'_1, \ldots, w'_v$ from the constraint

$$\nu \wedge \nu' \wedge \alpha^\bullet \wedge \alpha'^\bullet \wedge w_1 = w'_1 \wedge \ldots \wedge w_v = w'_v$$

is satisfiable. For instance, consider the rendez-vous rules $wait_A \to stop_A[m_A := y]$ and
$ready_B \to stop_B[true]$ of the running example. We first build up a new constraint by
conjoining the NC-constraints $y = m_B$ (matching of message templates), and
$n_A = n_B, m'_A = y, n'_A = n_A, m'_B = m_B, n'_B = n_B, id'_1 = id_1, id'_2 = id_2$ (guards and
actions of sender and receiver). After eliminating $y$ we obtain the constraint $\varphi$ defined as
$n_B = n_A, m'_A = m_B, n'_A = n_A, m'_B = m_B, n'_B = n_B, id'_1 = id_1, id'_2 = id_2$, defined
over the variables of the two considered threads. This step allows us to symbolically
represent the passing of names. After this step, we can represent the synchronization
of the two threads by using a rule that simultaneously rewrites all instances
that satisfy the constraints on the local data expressed by $\varphi$, i.e., we obtain the
rule

$$wait_A(id_1, n_A, m_A) \mid ready_B(id_2, n_B, m_B) \to stop_A(id'_1, n'_A, m'_A) \mid stop_B(id'_2, n'_B, m'_B) : \varphi$$
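The elimination step used in the example can be sketched for the special case of
a conjunction of equalities. This is a simplified illustration, not the general
procedure: NC-constraints also contain strict inequalities, which this sketch does
not handle, and the function name `eliminate` and the string encoding of
variables are assumptions made for the example.

```python
def eliminate(var, equalities):
    """Existentially eliminate var from a conjunction of equalities.

    equalities is a list of (lhs, rhs) pairs of variable names. If var is
    equated with some other term, that term is substituted for var; trivial
    equalities t = t are dropped from the result.
    """
    witness = None
    for a, b in equalities:
        if a == var and b != var:
            witness = b
            break
        if b == var and a != var:
            witness = a
            break
    result = []
    for a, b in equalities:
        if witness is not None:
            a = witness if a == var else a
            b = witness if b == var else b
        if var in (a, b):
            continue  # only var = var can remain here, which is trivially true
        if a != b:
            result.append((a, b))
    return result

# Message matching y = mB, guard nA = nB and receiver action mA' = y.
phi = eliminate("y", [("y", "mB"), ("nA", "nB"), ("mA'", "y")])
```

After elimination the receiver's update refers directly to the sender's variable,
which is how the passing of a name is represented symbolically.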

The complete translation of Example 1 is shown in Fig. 6 (for simplicity we have
applied a renaming of variables in the resulting rules). An example of a run in the
resulting $MSR_{NC}$ specification is shown in Fig. 7. Note that a fresh name is
selected among all values strictly greater than the current value of $fresh$ (e.g. in
the second step $6 > 4$), and then $fresh$ is updated to a value strictly greater than
all newly generated names (e.g. $8 > 6 > 4$).
$init \to fresh(x) \mid init_M(y) : x > 0, y = 0.$

$fresh(x) \mid init_M(y) \to fresh(x') \mid create(y') : x' > y', y' > x.$

$create(x) \to init_M(x') \mid init_A(id', n', m') : x' = x, id' = x, n' = 0, m' = 0.$

$create(x) \to init_M(x') \mid init_B(id', n', m') : x' = x, id' = x, n' = 0, m' = 0.$

$init_A(id, n, m) \mid fresh(u) \to gen_A(id, n', m) \mid fresh(u') : u' > n', n' > u.$

$gen_A(id_1, n, m) \mid init_B(id_2, u, v) \to wait_A(id_1, n, m) \mid gen_B(id_2, u', v') : u' = n, v' = v.$

$gen_B(id, n, m) \mid fresh(u) \to ready_B(id, n, m') \mid fresh(u') : u' > m', m' > u.$

$wait_A(id_1, n, m) \mid ready_B(id_2, u, v) \to stop_A(id_1, n, m') \mid stop_B(id_2, u, v) : n = u, m' = v.$

$stop_A(id, n, m) \to init_A(id', n', m') : n' = 0, m' = 0, id' = id.$

$stop_B(id, n, m) \to init_B(id', n', m') : n' = 0, m' = 0, id' = id.$

Fig. 6. Encoding of Example 1: for simplicity we embed constraints like $x = x'$ into
the MSR formulas.

$init \Rightarrow \ldots \Rightarrow fresh(4) \mid init_M(0) \mid init_A(2, 0, 0) \mid init_B(3, 0, 0)$
$\Rightarrow fresh(8) \mid init_M(0) \mid gen_A(2, 6, 0) \mid init_B(3, 0, 0)$
$\Rightarrow fresh(8) \mid init_M(0) \mid wait_A(2, 6, 0) \mid gen_B(3, 6, 0)$
$\Rightarrow \ldots \Rightarrow fresh(16) \mid init_M(0) \mid wait_A(2, 6, 0) \mid gen_B(3, 6, 0) \mid init_A(11, 0, 0)$

Fig. 7. A run in the encoded program.

Let $\mathcal{T} = (P_1, \ldots, P_t)$ be a collection of thread definitions and $G_0$ be an initial
global state. Let $\mathcal{S}$ be the $MSR_{NC}$ specification that results from the translation
described in the previous section.

Let $G = (N, p_1, \ldots, p_n)$ be a global configuration with $p_i = (s_i, v_{i1}, \ldots, v_{ik_i})$, and
let $h : N \to \mathbb{Q}^+$ be an injective mapping. Then, we define $G^\bullet(h)$ as the $MSR_{NC}$
configuration

$$s_1(h(v_{11}), \ldots, h(v_{1k_1})) \mid \ldots \mid s_n(h(v_{n1}), \ldots, h(v_{nk_n})) \mid fresh(v)$$

where $v$ is the first value strictly greater than all values in the range of $h$. Given
an $MSR_{NC}$ configuration $\mathcal{M}$ defined as $s_1(v_{11}, \ldots, v_{1k_1}) \mid \ldots \mid s_n(v_{n1}, \ldots, v_{nk_n})$
with $v_{ij} \in \mathbb{Q}^+$, let $V(\mathcal{M}) \subseteq \mathbb{Q}^+$ be the set of values occurring in $\mathcal{M}$. Then, given a
bijective mapping $f : V(\mathcal{M}) \to N \subseteq \mathcal{N}$, we define $\mathcal{M}^\bullet(f)$ as the global
configuration $(N, p_1, \ldots, p_n)$ where $p_i = (s_i, f(v_{i1}), \ldots, f(v_{ik_i}))$.
Based on the previous definitions, the following property then holds.

Theorem 2 

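For every run $G_0 G_1 \ldots$ in $\mathcal{T}$ with corresponding sets of names $N_0 N_1 \ldots$, there exist
sets $D_0 D_1 \ldots$ and bijective mappings $h_0 h_1 \ldots$ with $h_i : N_i \to D_i \subseteq \mathbb{Q}^+$ for $i \geq 0$,
such that $init\ G_0^\bullet(h_0)\ G_1^\bullet(h_1) \ldots$ is a run of $\mathcal{S}$. Vice versa, if $init\ \mathcal{M}_0 \mathcal{M}_1 \ldots$ is a
run of $\mathcal{S}$, then there exist sets $N_0 N_1 \ldots$ in $\mathcal{N}$ and bijective mappings $f_0 f_1 \ldots$ with
$f_i : V(\mathcal{M}_i) \to N_i$ for $i \geq 0$, such that $\mathcal{M}_0^\bullet(f_0)\ \mathcal{M}_1^\bullet(f_1) \ldots$ is a run in $\mathcal{T}$.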

Proof 

We first prove that every run in $\mathcal{T}$ is simulated by a run in $\mathcal{S}$.

Let $G_0 \ldots G_l$ be a run in $\mathcal{T}$, i.e., a sequence of global states (with associated sets
of names $N_0 \ldots N_l$) such that $G_i \Rightarrow G_{i+1}$ and $N_i \subseteq N_{i+1}$ for $i \geq 0$.
We prove that it can be simulated in $\mathcal{S}$ by induction on its length $l$.
Specifically, suppose that there exist sets of non-negative rationals $D_0 \ldots D_l$ and
bijective mappings $h_0 \ldots h_l$ with $h_i : N_i \to D_i$ for $0 \leq i \leq l$, such that

$$init\ G_0^\bullet(h_0) \ldots G_l^\bullet(h_l)$$

is a run of $\mathcal{S}$. Furthermore, suppose $G_l \Rightarrow G_{l+1}$.

We prove the thesis by a case-analysis on the type of rule applied in the last step
of the run.

Let $G_l = (N_l, p_1, \ldots, p_r)$ and $p_j = (s, n_1, \ldots, n_k)$ be a local configuration for the
thread definition $P = (Q, s_0, V, R)$ with $V = \{x_1, \ldots, x_k\}$ and $n_i \in N_l$ for $i : 1, \ldots, k$.

Assignment. Suppose there exists a rule $s \to s'[\gamma, \alpha]$ in $R$ such that $\rho_{p_j}$ satisfies $\gamma$,
$G_l = (N_l, \ldots, p_j, \ldots) \Rightarrow (N_{l+1}, \ldots, p'_j, \ldots) = G_{l+1}$, $N_l = N_{l+1}$, $p'_j = (s', n'_1, \ldots, n'_k)$,
and if $x_i := y_i$ occurs in $\alpha$, then $n'_i = \rho_{p_j}(y_i)$, otherwise $n'_i = n_i$ for $i : 1, \ldots, k$.
The encoding of the rule returns one $MSR_{NC}$ rule having the form

$$s(x_1, \ldots, x_k) \to s'(x'_1, \ldots, x'_k) : \gamma', \alpha^\bullet$$

for every $\gamma' \in \gamma^\bullet$.

By inductive hypothesis, $G_l^\bullet(h_l)$ is a multiset of atomic formulas that contains the
formula $s(h_l(n_1), \ldots, h_l(n_k))$.

Now let us define $h_{l+1}$ as the mapping from $N_l$ to $D_l$ such that $h_{l+1}(n'_i) = h_l(n_j)$ if
$x_i := x_j$ is in $\alpha$ and $h_{l+1}(n'_i) = 0$ if $x_i := \bot$ is in $\alpha$. Furthermore, let us define
the evaluation

$$\sigma = \langle x_1 \mapsto h_l(n_1), \ldots, x_k \mapsto h_l(n_k), x'_1 \mapsto h_{l+1}(n'_1), \ldots, x'_k \mapsto h_{l+1}(n'_k) \rangle$$

Then, by construction of the set of constraints $\gamma^\bullet$ and of the constraint $\alpha^\bullet$, it follows
that $\sigma$ is a solution for $\gamma', \alpha^\bullet$ for some $\gamma' \in \gamma^\bullet$. As a consequence, we have that

$$s(n_1, \ldots, n_k) \to s'(n'_1, \ldots, n'_k)$$

is a ground instance of one of the considered $MSR_{NC}$ rules.

Thus, starting from the $MSR_{NC}$ configuration $G_l^\bullet(h_l)$, if we apply a rewriting step
we obtain a new configuration in which $s(n_1, \ldots, n_k)$ is replaced by $s'(n'_1, \ldots, n'_k)$,
and all the other atomic formulas in $G_{l+1}^\bullet(h_{l+1})$ are the same as in $G_l^\bullet(h_l)$. The
resulting $MSR_{NC}$ configuration coincides then with the definition of $G_{l+1}^\bullet(h_{l+1})$.
Creation of new names. Let us now consider the case of fresh name generation.
Suppose there exists a rule $s \to s'[x_i := new]$ in $R$, and let $n \notin N_l$, and suppose
$(N_l, \ldots, p_j, \ldots) \Rightarrow (N_{l+1}, \ldots, p'_j, \ldots)$ where $N_{l+1} = N_l \cup \{n\}$, $p'_j = (s', n'_1, \ldots, n'_k)$
where $n'_i = n$, and $n'_j = n_j$ for $j \neq i$.

We note then that the encoding of the previous rule returns the $MSR_{NC}$ rule

$$s(x_1, \ldots, x_k) \mid fresh(x) \to s'(x'_1, \ldots, x'_k) \mid fresh(x') : \varphi$$

where $\varphi$ consists of the constraints $x' > x'_i$, $x'_i > x$ and $x'_j = x_j$ for $j \neq i$. By
inductive hypothesis, $G_l^\bullet(h_l)$ is a multiset of atomic formulas that contains the
formulas $s(h_l(n_1), \ldots, h_l(n_k))$ and $fresh(v)$ where $h_l$ is a mapping into $D_l$, and $v$
is the first non-negative rational strictly greater than all values occurring in the
formulas denoting processes.

Let $v$ be a non-negative rational strictly greater than all values in $D_l$. Furthermore,
let us define $v' = v + 1$ and $D_{l+1} = D_l \cup \{v, v'\}$.

Furthermore, we define $h_{l+1}$ as follows: $h_{l+1}(n) = h_l(n)$ for $n \in N_l$, and $h_{l+1}(n'_i) =
h_{l+1}(n) = v'$. Furthermore, we define the following evaluation

$$\sigma = \langle x \mapsto v, x_1 \mapsto h_l(n_1), \ldots, x_k \mapsto h_l(n_k), x' \mapsto v', x'_1 \mapsto h_{l+1}(n'_1), \ldots, x'_k \mapsto h_{l+1}(n'_k) \rangle$$

Then, by construction, it follows that $\sigma$ is a solution for $\varphi$. Thus,

$$s(n_1, \ldots, n_k) \mid fresh(v) \to s'(n'_1, \ldots, n'_k) \mid fresh(v')$$

is a ground instance of the considered $MSR_{NC}$ rule.

Starting from the $MSR_{NC}$ configuration $G_l^\bullet(h_l)$, if we apply a rewriting step we
obtain a new configuration in which $s(n_1, \ldots, n_k)$ and $fresh(v)$ are substituted by
$s'(n'_1, \ldots, n'_k)$ and $fresh(v')$, and all the other atomic formulas in $G_{l+1}^\bullet(h_{l+1})$ are
the same as in $G_l^\bullet(h_l)$. We conclude by noting that this formula coincides with the
definition of $G_{l+1}^\bullet(h_{l+1})$.

For the sake of brevity we omit the case of thread creation, whose only difference
from the previous cases is the creation of several new atoms (with values obtained
by evaluating the action) instead of only one.

Rendez-vous. Let $p_i = (s, n_1, \ldots, n_k)$ and $p_j = (t, m_1, \ldots, m_u)$ be two local
configurations for threads $P \neq P'$, with $n_i \in N_l$ for $i : 1, \ldots, k$ and $m_i \in N_l$ for $i : 1, \ldots, u$.

Suppose $s \xrightarrow{e!m} s'[\gamma, \alpha]$ and $t \xrightarrow{e?m'} t'[\gamma', \alpha']$, where $m = (x_{i_1}, \ldots, x_{i_v})$, and $m' =
(y_1, \ldots, y_v)$ (all defined over distinct variables), are rules in $R$.
Furthermore, suppose that $\rho_{p_i}$ satisfies $\gamma$, and that $\rho'$ (see definition of the
operational semantics) satisfies $\gamma'$, and suppose that $G_l = (N_l, \ldots, p_i, \ldots, p_j, \ldots) \Rightarrow
(N_{l+1}, \ldots, p'_i, \ldots, p'_j, \ldots) = G_{l+1}$, where $N_{l+1} = N_l$, $p'_i = (s', n'_1, \ldots, n'_k)$, $p'_j =
(t', m'_1, \ldots, m'_u)$, and if $x_i := e$ occurs in $\alpha$, then $n'_i = \rho_{p_i}(e)$, otherwise $n'_i = n_i$
for $i : 1, \ldots, k$; if $u_i := e$ occurs in $\alpha'$, then $m'_i = \rho'(e)$, otherwise $m'_i = m_i$ for
$i : 1, \ldots, u$.

By inductive hypothesis, $G_l^\bullet(h_l)$ is a multiset of atomic formulas that contains the
formulas $s(h_l(n_1), \ldots, h_l(n_k))$ and $t(h_l(m_1), \ldots, h_l(m_u))$.

Now, let us define $h_{l+1}$ as the mapping from $N_l$ to $D_l$ such that $h_{l+1}(n'_i) = h_l(n_j)$
if $x_i := x_j$ is in $\alpha$, $h_{l+1}(m'_i) = h_l(m_j)$ if $u_i := u_j$ is in $\alpha'$, $h_{l+1}(n'_i) = 0$ if $x_i := \bot$ is
in $\alpha$, $h_{l+1}(m'_i) = 0$ if $u_i := \bot$ is in $\alpha'$.

Now, let us define $\sigma$ as the evaluation from $N_l$ to $D_l$ such that

$$\sigma = \sigma_1 \cup \sigma_2$$

$$\sigma_1 = \langle x_1 \mapsto h_l(n_1), \ldots, x_k \mapsto h_l(n_k), u_1 \mapsto h_l(m_1), \ldots, u_u \mapsto h_l(m_u) \rangle$$

$$\sigma_2 = \langle x'_1 \mapsto h_{l+1}(n'_1), \ldots, x'_k \mapsto h_{l+1}(n'_k), u'_1 \mapsto h_{l+1}(m'_1), \ldots, u'_u \mapsto h_{l+1}(m'_u) \rangle.$$

Then, by construction of the sets of constraints $\gamma^\bullet$, $\gamma'^\bullet$, $\alpha^\bullet$ and $\alpha'^\bullet$ it follows that $\sigma$ is
a solution for the constraint

$$\exists w'_1 \ldots \exists w'_p .\ \theta \wedge \theta' \wedge \alpha^\bullet \wedge \alpha'^\bullet \wedge w_1 = w'_1 \wedge \ldots \wedge w_p = w'_p$$

for some $\theta \in \gamma^\bullet$ and $\theta' \in \gamma'^\bullet$. Note in fact that the equalities $w_i = w'_i$ express the
passing of values defined via the evaluation $\rho'$ in the operational semantics.
As a consequence,

$$s(n_1, \ldots, n_k) \mid t(m_1, \ldots, m_u) \to s'(n'_1, \ldots, n'_k) \mid t'(m'_1, \ldots, m'_u)$$

is a ground instance of one of the considered $MSR_{NC}$ rules.

Thus, starting from the $MSR_{NC}$ configuration $G_l^\bullet(h_l)$, if we apply a rewriting
step we obtain a new configuration in which $s(n_1, \ldots, n_k)$ has been replaced by
$s'(n'_1, \ldots, n'_k)$, and $t(m_1, \ldots, m_u)$ has been replaced by $t'(m'_1, \ldots, m'_u)$, and all the
other atomic formulas are as in $G_l^\bullet(h_l)$. This formula coincides with the definition
of $G_{l+1}^\bullet(h_{l+1})$.

The proof of completeness is by induction on the length of an MSR run, and by 
case-analysis on the application of the rules. The structure of the case analysis is 
similar to the previous one and it is omitted for brevity. □ 

4 Verification of TDL Programs 

Safety and invariant properties are probably the most important class of correctness
specifications for the validation of a concurrent system. For instance, in Example
1 we could be interested in proving that every time a session terminates, two
instances of thread Init and Resp have exchanged the two names generated during
the session. To prove the protocol correct independently from the number of names
and threads generated during an execution, we have to show that from the initial
configuration $G_0$ it is not possible to reach a configuration that violates the
aforementioned property. The configurations that violate the property are those in
which two instances of Init and Resp conclude the execution of the protocol
exchanging only the first nonce. These configurations can be represented by looking
at only two threads and at the relationship among their local data. Thus, we can
reduce the verification problem of this safety property to the following problem:
Given an initial configuration $G_0$ we would like to decide if a global configuration
that contains at least two local configurations having the form $(stop_A, i, n, m)$
and $(stop_B, i', n', m')$ with $n' = n$ and $m \neq m'$ for some $i, i', n, n', m, m'$ is
reachable. This problem can be viewed as an extension of the control state reachability
problem defined in (Abdulla and Nylen 2000) in which we consider both control
locations and local variables. Although control state reachability is undecidable (see
the corollary stated earlier), the encoding of TDL into $MSR_{NC}$ can be used to define a
sound and automatic verification method for TDL programs. For this purpose, we will
exploit a verification method introduced for MSR(C) in (Delzanno 2001; Delzanno 2005).
In the rest of this section we will briefly summarize how to adapt the main results
in (Delzanno 2001; Delzanno 2005) to the specific case of $MSR_{NC}$.

Let us first reformulate the control state reachability problem of Example 1 for
the aforementioned safety property on the low level encoding into $MSR_{NC}$. Given
the $MSR_{NC}$ initial configuration $init$ we would like to check that no configuration
in $Post^*(\{init\})$ has the following form

$$\{stop_A(a_1, v_1, w_1), stop_B(a_2, v_2, w_2)\} \oplus \mathcal{M}$$

for $a_i, v_i, w_i \in \mathbb{Q}$, $i : 1, 2$ and an arbitrary multiset of ground atoms $\mathcal{M}$. Let us call
$U$ the set of bad $MSR_{NC}$ configurations having the aforementioned shape. Notice
that $U$ is upward closed with respect to multiset inclusion, i.e., if $\mathcal{M} \in U$ and
$\mathcal{M} \preccurlyeq \mathcal{M}'$, then $\mathcal{M}' \in U$. Furthermore, if $U$ is upward closed, so is $Pre(U)$.
On the basis of this property, we can try to apply the methodology proposed in
(Abdulla and Nylen 2000) to develop a procedure to compute a finite representation
$R$ of $Pre^*(U)$. For this purpose, we need the following ingredients:

1. a symbolic representation of upward closed sets of configurations (e.g. a set
of assertions $S$ whose denotation $[\![S]\!]$ is $U$);

2. a computable symbolic predecessor operator $SPre$ working on sets of formulas
such that $[\![SPre(S)]\!] = Pre([\![S]\!])$;

3. a (decidable) entailment relation $Ent$ to compare the denotations of symbolic
representations, i.e., such that $Ent(N, M)$ implies $[\![N]\!] \subseteq [\![M]\!]$. If such a
relation $Ent$ exists, then it can be naturally extended to sets of formulas as
follows: $Ent^S(S, S')$ if and only if for all $N \in S$ there exists $M \in S'$ such that
$Ent(N, M)$ holds (clearly, if $Ent$ is an entailment, then $Ent^S(S, S')$ implies
$[\![S]\!] \subseteq [\![S']\!]$).

The combination of these three ingredients can be used to define a verification
method based on backward reasoning as explained next.

Symbolic Backward Reachability. Suppose that $M_1, \ldots, M_n$ are the formulas of our
assertional language representing the infinite set $U$ consisting of all bad
configurations. The symbolic backward reachability (SBR) procedure computes a
chain $\{I_i\}_{i \geq 0}$ of sets of assertions such that

$$I_0 = \{M_1, \ldots, M_n\}$$

$$I_{i+1} = I_i \cup SPre(I_i) \text{ for } i \geq 0$$

The procedure SBR stops when $SPre$ produces only redundant information, i.e.,
$Ent^S(I_{i+1}, I_i)$. Notice that $Ent^S(I_i, I_{i+1})$ always holds since $I_i \subseteq I_{i+1}$.
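The SBR iteration can be sketched generically as below, with the symbolic
predecessor and the entailment test passed in as functions. Everything concrete
here is an assumption made for illustration: the names `sbr`, `toy_spre` and
`toy_ent`, the `max_iters` guard, and the toy instance in which an assertion is
a natural number $m$ denoting all configurations with at least $m$ tokens and the
only rule adds one token. The sketch also keeps only the non-redundant elements
of $SPre(I_i)$, which does not change the denotation of $I_{i+1}$.

```python
def sbr(bad, spre, ent, max_iters=1000):
    """Symbolic backward reachability over a list of assertions.

    spre maps a list of assertions to their symbolic predecessors;
    ent(a, b) must imply that the denotation of a is included in that of b.
    """
    current = list(bad)                        # I_0
    for _ in range(max_iters):
        new = [m for m in spre(current)
               if not any(ent(m, old) for old in current)]
        if not new:                            # SPre added only redundant data
            return current
        current = current + new                # I_{i+1}
    raise RuntimeError("no fixpoint reached within max_iters")

def toy_spre(assertions):
    # Rule "add one token": predecessors of "at least m" are "at least m - 1".
    return [max(0, m - 1) for m in assertions]

def toy_ent(a, b):
    # "at least a tokens" is included in "at least b tokens" iff a >= b.
    return a >= b

fixpoint = sbr([3], toy_spre, toy_ent)
# The empty initial configuration (0 tokens) can reach the bad set
# iff it belongs to the denotation of some assertion in the fixpoint.
init_is_unsafe = any(toy_ent(0, m) for m in fixpoint)
```

In the toy instance the bad set "at least 3 tokens" is reachable from the empty
configuration, so the initial state ends up in the computed fixpoint.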

Symbolic Representation. In order to find an adequate representation of infinite sets of
$MSR_{NC}$ configurations we can resort to the notion of constrained configuration
introduced in (Delzanno 2001) for the language scheme MSR(C) defined for a generic
constraint system C. We can instantiate this notion with NC-constraints as follows.
A constrained configuration over $\mathcal{P}$ is a formula

$$p_1(x_{11}, \ldots, x_{1k_1}) \mid \ldots \mid p_n(x_{n1}, \ldots, x_{nk_n}) : \varphi$$

where $p_1, \ldots, p_n \in \mathcal{P}$, $x_{i1}, \ldots, x_{ik_i} \in \mathcal{V}$ for any $i : 1, \ldots, n$ and $\varphi$ is an NC-constraint.
The denotation of a constrained configuration $M \doteq (\mathcal{M} : \varphi)$ is defined by taking the
upward closure with respect to multiset inclusion of the set of ground instances,
namely

$$[\![M]\!] = \{\mathcal{M}' \mid \sigma(\mathcal{M}) \preccurlyeq \mathcal{M}', \sigma \in Sol(\varphi)\}$$

This definition can be extended to sets of $MSR_{NC}$ constrained configurations with
disjoint variables (we use variable renaming to avoid variable name clashing) in the
natural way.

In our example the following set $S_U$ of $MSR_{NC}$ constrained configurations (with
distinct variables) can be used to finitely represent all possible violations $U$ to the
considered safety property

$$S_U = \{\ stop_A(i_1, n_1, m_1) \mid stop_B(i_2, n_2, m_2) : n_1 = n_2, m_1 > m_2,$$
$$\quad\ \ stop_A(i_1, n_1, m_1) \mid stop_B(i_2, n_2, m_2) : n_1 = n_2, m_2 > m_1\ \}$$

Notice that we need two formulas to represent $m_1 \neq m_2$ using a disjunction
of $>$ constraints. The $MSR_{NC}$ configurations $stop_B(1, 2, 6) \mid stop_A(4, 2, 5)$, and
$stop_B(1, 2, 6) \mid stop_A(4, 2, 5) \mid wait_A(2, 7, 3)$ are both contained in the denotation
of $S_U$. Actually, we have that $[\![S_U]\!] = U$. This symbolic representation allows us to
reason on infinite sets of $MSR_{NC}$ configurations, and thus on global configurations
of a TDL program, forgetting the actual number of threads of a given run.
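Membership of a ground configuration in the denotation of $S_U$ can be checked
directly, as in the following sketch. The function name `violates` and the tuple
representation `(predicate, id, n, m)` of ground atoms are assumptions made for
the example; the check mirrors the two constrained configurations of $S_U$.

```python
def violates(config):
    """Return True iff the ground configuration is in the denotation of S_U.

    config is a list of ground atoms (predicate, id, n, m). Because the
    denotation is upward closed, any additional atoms are simply ignored.
    """
    stop_a = [atom for atom in config if atom[0] == "stopA"]
    stop_b = [atom for atom in config if atom[0] == "stopB"]
    return any(a[2] == b[2] and (a[3] > b[3] or b[3] > a[3])
               for a in stop_a for b in stop_b)

small = [("stopB", 1, 2, 6), ("stopA", 4, 2, 5)]
larger = small + [("waitA", 2, 7, 3)]
```

Both configurations mentioned in the text are recognized as violations, while a
pair of threads agreeing on both names is not.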

To manipulate constrained configurations, we can instantiate to NC-constraints
the symbolic predecessor operator $SPre$ defined for a generic constraint system in
(Delzanno 2005). Its definition is also given in Appendix A.
From the general properties proved in (Delzanno 2005), we have that when applied
to a finite set of $MSR_{NC}$ constrained configurations $S$, $SPre_{NC}$ returns a finite set
of constrained configurations such that $[\![SPre_{NC}(S)]\!] = Pre([\![S]\!])$, i.e., $SPre_{NC}(S)$
is a symbolic representation of the immediate predecessors of the configurations in
the denotation (an upward closed set) of $S$. Similarly we can instantiate the generic
entailment operator defined in (Delzanno 2005) to $MSR_{NC}$ constrained configurations
so as to obtain a relation $Ent_{NC}$ such that $Ent_{NC}(N, M)$ implies $[\![N]\!] \subseteq [\![M]\!]$.
Based on these properties, we have the following result.

Proposition 1 

Let $\mathcal{T}$ be a TDL program with initial global configuration $G_0$. Furthermore, let
$\mathcal{S}$ be the corresponding $MSR_{NC}$ encoding, and $S_U$ be the set of $MSR_{NC}$
constrained configurations denoting a given set of bad TDL configurations. Then,
$init \notin SPre^*_{NC}(S_U)$ if and only if there is no finite run $G_0 \ldots G_n$ and
mappings $h_0, \ldots, h_n$ from the names occurring in $G$ to non-negative rationals such
that $init\ G_0^\bullet(h_0) \ldots G_n^\bullet(h_n)$ is a run in $\mathcal{S}$ and $G_n^\bullet(h_n) \in [\![U]\!]$.

Proof 

Suppose $init \notin SPre^*_{NC}(U)$. Since $[\![SPre_{NC}(S)]\!] = Pre([\![S]\!])$ for any $S$, it follows
that there cannot exist runs $init\ \mathcal{M}_0 \ldots \mathcal{M}_n$ in $\mathcal{S}$ such that $\mathcal{M}_n \in [\![U]\!]$. The thesis
then follows from Theorem 2. □

As discussed in (Bozzano and Delzanno 2002), we have implemented our verification
procedure based on MSR and linear constraints using a CLP system with linear
arithmetics. By the translation presented in this paper, we can now reduce the
verification of safety properties of multithreaded programs to a fixpoint computation
built on constraint operations. As an example, we have applied our CLP-prototype
to automatically verify the specification of Fig. 6. The unsafe states are those
described in Section 4. Symbolic backward reachability terminates after 18 iterations
and returns a symbolic representation of the fixpoint with 2590 constrained
configurations. The initial state $init$ is not part of the resulting set. This proves our
original thread definitions correct with respect to the considered safety property.

4.1 An Interesting Class of TDL Programs

The proof of Theorem 1 shows that verification of safety properties is undecidable
for TDL specifications in which threads have several local variables (they
are used to create linked lists). As mentioned in the introduction, we can apply
the sufficient conditions for the termination of the procedure SBR given in
(Bozzano and Delzanno 2002; Delzanno 2005) to identify the following interesting
subclass of TDL programs.

Definition 4 

A monadic TDL thread definition $P = (Q, s_0, V, R)$ is such that $V$ is at most a
singleton, and every message template in $R$ has at most one variable.

A monadic thread definition can be encoded into the monadic fragment of $MSR_{NC}$
studied in (Delzanno 2005). Monadic $MSR_{NC}$ specifications are defined over atomic
formulas of the form $p$ or $p(x)$ where $p$ is a predicate symbol and $x$ is a variable, and
on atomic constraints of the form $x = y$ and $x > y$. To encode monadic TDL
thread definitions into a monadic $MSR_{NC}$ specification, we first need the following
observation. Since in our encoding we only use the constant 0, we first notice that
we can restrict our attention to $MSR_{NC}$ specifications in which constraints have no
constants at all. Specifically, to encode the generation of fresh names we only have
to add an auxiliary atomic formula $zero(z)$, and refer to it every time we need to
express the constant 0. As an example, we could write rules like

$$init \to fresh(x) \mid init_M(y) \mid zero(z) : x > z, y = z$$

for initialization, and

$$create(x) \mid zero(z) \to init_M(x') \mid init_A(id', n', m') \mid zero(z') : x' = x, id' = x, n' = z, m' = z, z' = z$$

for all assignments involving the constant 0. By using this trick and by following the
encoding of Section 3.2, the translation of a collection of monadic thread definitions
directly returns a monadic $MSR_{NC}$ specification. By exploiting this property, we
obtain the following result.

Theorem 3 

The verification of safety properties whose violations can be represented via an 
upward closed set U of global configurations is decidable for a collection T of 
monadic TDL definitions. 

Proof 

Let $\mathcal{S}$ be the $MSR_{NC}$ encoding of $\mathcal{T}$ and $S_U$ be the set of constrained configurations
such that $[\![S_U]\!] = U$. The proof is based on the following properties. First of all, the
$MSR_{NC}$ specification $\mathcal{S}$ is monadic. Furthermore, as shown in (Delzanno 2005),
the class of monadic $MSR_{NC}$ constrained configurations is closed under application
of the operator $SPre_{NC}$. Finally, as shown in (Delzanno 2005), there exists an
entailment relation $CEnt$ for monadic constrained configurations that ensures the
termination of the SBR procedure applied to a monadic $MSR_{NC}$ specification.
Thus, for the monadic $MSR_{NC}$ specification $\mathcal{S}$, the chain defined as $I_0 = S_U$,
$I_{i+1} = I_i \cup SPre(I_i)$ always reaches a point $k \geq 1$ in which $CEnt^S(I_{k+1}, I_k)$, i.e.
$[\![I_k]\!]$ is a fixpoint for $Pre$. Finally, we note that we can always check for membership
of $init$ in the resulting set $I_k$. □

As shown in (Schnoebelen 2002), the complexity of verification methods based on
symbolic backward reachability relying on the general results in (Abdulla and Nylen 2000;
Finkel and Schnoebelen 2001) is non primitive recursive.

5 Conclusions and Related Work 

In this paper we have defined the theoretical grounds for the possible application 
of constraint-based symbolic model checking for the automated analysis of abstract 
models of multithreaded concurrent systems providing name generation, name mo- 
bility, and unbounded control. Our verification approach is based on an encoding 
into a low level formalism based on the combination of multiset rewriting and 
constraints that allows us to naturally implement name generation, value passing, 
and dynamic creation of threads. Our verification method makes use of symbolic
representations of infinite sets of system states and of symbolic backward
reachability. For this reason, it can be viewed as a conservative extension of traditional
finite-state model checking methods. The use of symbolic state analysis is strictly 
related to the analysis methods based on abstract interpretation. A deeper study 
of the connections with abstract interpretation is an interesting direction for future 
research. 

Related Work The high level syntax we used to present the abstract models of 
multithreaded programs is an extension of the communicating finite state machines 
used in protocol verification IjBochmann 1978j) . and used for representing abstrac- 
tion of multithreaded software programs l|Ball et al. 2001jl . In our setting we enrich 
the formalism with local variables, name generation and mobility, and unbounded 
control. Our verification approach is inspired by the recent work of Abdulla and 
Jonsson. In (|Abdulla and Jonsson 2003)l . Abdulla and Jonsson proposed an asser- 
tional language for Timed Networks in which they use dedicated data structures 
to symbolically represent configurations parametric in the number of tokens and 
in the age (a real number) associated to tokens. In (Abd ulla and Nylen 2000| |, Ab- 
dulla and Nylen formulate a symbolic algorithm using existential zones to rep- 
resent the state-space of Timed Petri Nets. Our approach generalizes the ideas of 
l|Abdulla and Jonsson 20031 1 Abdulla and Nylen 20 00) to systems specified via mul- 
tiset rewriting and with more general classes of constraints. In QAbdulla and Jonsson 2 001). 
the authors apply similar ideas to (unbounded) channel systems in which messages 
can vary over an infinite name domain and can be stored in a finite (and fixed a
priori) number of data variables. However, they do not relate these results to
multithreaded programs. Multiset rewriting over first order atomic formulas has been
proposed for specifying security protocols by Cervesato et al. in (Cervesato et al. 1999).
The relationships between this framework and concurrent languages based on
process algebra have been recently studied in (Bistarelli et al. 2005). Apart from
approaches based on Petri Net-like models (as in German and Sistla 1992; Ball et al. 2001),
networks of finite-state processes can also be verified by means of automata
theoretic techniques as in (Bouajjani et al. 2000). In this setting the set of possible
local states of individual processes is abstracted into a finite alphabet. Sets of
global states are then represented as regular languages, and transitions as relations
on languages. Unlike the automata theoretic approach, in our setting
we handle parameterized systems in which individual components have local
variables that range over unbounded values. The use of constraints for the verification
of concurrent systems is related to previous works connecting Constraint Logic 
Programming and verification, see e.g. (Delzanno and Podelski 1999). In this
setting transition systems are encoded via CLP programs used to encode the global
state of a system and its updates. In the approach proposed in (Delzanno 2001;
Bozzano and Delzanno 2002), we refine this idea by using multiset rewriting and
constraints to locally specify updates to the global state. In (Delzanno 20©), we
defined the general framework of multiset rewriting with constraints and the
corresponding symbolic analysis technique. The language proposed in (Delzanno 2001) is
given for a generic constraint system C (taking inspiration from CLP, the language
is called MSR(C)). In (Bozzano and Delzanno 2002), we applied this formalism to
verify properties of mutual exclusion protocols (variations of the ticket algorithm) 
for systems with an arbitrary number of processes. In the same paper we also for- 
mulated sufficient conditions for the termination of the backward analysis. The 
present paper is the first attempt to relate the low level language proposed in
(Delzanno 2001) to a high level language with explicit management of names and
threads.

Appendix A Symbolic Predecessor Operator

Given a set of $MSR_{NC}$ configurations $S$, consider the $MSR_{NC}$ predecessor operator
$Pre(S) = \{ M \mid M \Rightarrow_{MSR} M',\ M' \in S \}$. In our assertional language, we can define a
symbolic version $SPre_{NC}$ of $Pre$ defined on a set $S$ containing $MSR_{NC}$ constrained
multisets (with disjoint variables) as follows:

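\[
\begin{array}{ll}
SPre_{NC}(S) = \{\ (A \oplus N : \xi) \mid & (A \rightarrow B : \psi) \in \mathcal{R},\ (M : \varphi) \in S, \\
 & M' \preceq M,\ B' \preceq B, \\
 & (M' : \varphi) =_{\theta} (B' : \psi),\ N = M \ominus M', \\
 & \xi \equiv (\exists x_1 \ldots x_k.\, \theta) \\
 & \mbox{and } x_1, \ldots, x_k \mbox{ are all variables not in } A \oplus N \ \}.
\end{array}
\]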

where $=_{\theta}$ is a matching relation between constrained configurations that also takes
constraint satisfaction into account, namely

\[
(A_1 \mid \ldots \mid A_n : \varphi) =_{\theta} (B_1 \mid \ldots \mid B_m : \psi)
\]

provided $m = n$ and there exists a permutation $j_1, \ldots, j_n$ of $1, \ldots, n$ such that
the constraint $\theta \equiv \varphi \wedge \psi \wedge \bigwedge_{i=1}^{n} A_i = B_{j_i}$ is satisfiable; here
$p(x_1, \ldots, x_r) = q(y_1, \ldots, y_s)$ is an abbreviation for the constraints
$x_1 = y_1 \wedge \ldots \wedge x_r = y_s$ if $p = q$ and $s = r$, false otherwise.
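
The purely syntactic part of the matching relation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of the paper: the names `atom_equalities` and `candidate_matchings` and the tuple encoding of atoms are hypothetical, and the sketch enumerates candidate pairings of sub-multisets together with the equalities they induce. It does not check satisfiability of the resulting constraint and does not eliminate existential variables.

```python
from itertools import combinations, permutations

# Assumed encoding: an atom is (predicate_name, tuple_of_variable_names),
# and a multiset of atoms is a Python list of such atoms.

def atom_equalities(a, b):
    """Expand p(x1..xr) = q(y1..ys) into pairwise variable equalities.

    Returns None (standing for 'false') if the predicates or arities differ.
    """
    (p, xs), (q, ys) = a, b
    if p != q or len(xs) != len(ys):
        return None
    return list(zip(xs, ys))

def candidate_matchings(M, B):
    """Yield (equalities, remainder) for each pairing of a sub-multiset of M
    with an equally sized sub-multiset of B; remainder is M minus the matched part.
    """
    for k in range(min(len(M), len(B)) + 1):
        for m_idx in combinations(range(len(M)), k):
            for b_idx in permutations(range(len(B)), k):
                eqs = []
                for i, j in zip(m_idx, b_idx):
                    e = atom_equalities(M[i], B[j])
                    if e is None:
                        break  # this pairing is syntactically impossible
                    eqs.extend(e)
                else:
                    rest = [M[i] for i in range(len(M)) if i not in m_idx]
                    yield eqs, rest

# Hypothetical input: a configuration p(x,z) | f(y) and a rule right-hand side
# p(u',m') | r(t',v').
M = [("p", ("x", "z")), ("f", ("y",))]
B = [("p", ("u'", "m'")), ("r", ("t'", "v'"))]
results = list(candidate_matchings(M, B))
```

On this input the enumeration finds two candidates: the empty matching, which leaves the whole configuration as remainder, and the matching of the two `p` atoms, which leaves `f(y)`. A full implementation would conjoin the induced equalities with the two constraints and pass the result to a constraint solver.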

As proved in (Delzanno 2005), the symbolic operator $SPre_{NC}$ returns a set of
$MSR_{NC}$ constrained configurations and it is correct and complete with respect to
$Pre$, i.e., $[\![SPre_{NC}(S)]\!] = Pre([\![S]\!])$ for any $S$. It is important to note the difference
between $SPre_{NC}$ and a simple backward rewriting step.

For instance, given the constrained configuration $M$ defined as $p(x, z) \mid f(y) : z > y$
and the rule $s(u, m) \mid r(t, v) \rightarrow p(u', m') \mid r(t', v') : u = t, m' = v, v' = v, u' = u, t' = t$
(that simulates a rendez-vous ($u$, $t$ are channels) and value passing
($m' = v$)), the application of $SPre_{NC}$ returns $s(u, m) \mid r(t, v) \mid f(y) : u = t, v > y$ as
well as $s(u, m) \mid r(t, v) \mid p(x, z) \mid f(y) : u = t, z > y$ (the common multiset here is
$\epsilon$).
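
The first of the two results can be traced through the definition step by step. The derivation below is a sketch that assumes the reading of the symbols used in this appendix: $\varphi$ is the constraint of the configuration, $\psi$ the constraint of the rule, $M'$ and $B'$ the matched sub-multisets, and $N$ the remainder of the configuration.

```latex
\begin{align*}
M' &= p(x, z), \qquad B' = p(u', m'), \qquad N = M \ominus M' = f(y) \\
\theta &\equiv \underbrace{z > y}_{\varphi}
  \;\wedge\; \underbrace{u = t \wedge m' = v \wedge v' = v \wedge u' = u \wedge t' = t}_{\psi}
  \;\wedge\; \underbrace{x = u' \wedge z = m'}_{p(x, z) = p(u', m')} \\
\xi &\equiv \exists x, z, u', m', t', v'.\ \theta \;\equiv\; u = t \wedge v > y
\end{align*}
```

The quantified variables are exactly those that do not occur in $s(u, m) \mid r(t, v) \mid f(y)$. Since $z = m' = v$, the condition $z > y$ on the consumed atom becomes $v > y$ on the value carried by the receiver, which is how the constraint propagates backwards through the value passing step.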



